Changelog
=========

1.1.0 (2024-09-15)
------------------

Changed
~~~~~~~

- Some arguments must be keyword arguments in:

  - ``spoonbill.FileAnalyzer.analyze_file``
  - ``spoonbill.FileFlattener.__init__``
  - ``spoonbill.spec.Table.missing_rows``
  - ``spoonbill.spec.Table.available_rows``
  - ``spoonbill.stats.DataPreprocessor.__init__``
  - ``spoonbill.stats.DataPreprocessor.init_tables``
  - ``spoonbill.stats.DataPreprocessor.process_items``
  - ``spoonbill.utils.iter_file``

- Drop support for Python 3.8.

1.0.13 (2024-09-04)
-------------------

Added
~~~~~

- Add support for Python 3.12.

Changed
~~~~~~~

- Drop support for Python 3.7.

1.0.12 (2023-09-19)
-------------------

Fixed
~~~~~

- Fixed error if no releases.

1.0.11 (2023-08-17)
-------------------

Added
~~~~~

- Add support for Python 3.11.

Changed
~~~~~~~

- Drop support for Python 3.6.

1.0.10 (2022-04-07)
-------------------

Fixed
~~~~~

* Fixed fallback for unsupported locale
* Fixed unclosed file

1.0.9b10 (2021-10-17)
---------------------

Fixed
~~~~~

* Fixed exception with flattening additional array of decimals (`ae55c `__)

1.0.9b9 (2021-10-15)
--------------------

Fixed
~~~~~

* Fixed tool raising exception with UK tender data (`194638 `__)

1.0.9b8 (2021-09-22)
--------------------

Fixed
~~~~~

* Headers for split child table (`2c97e6f `__)

1.0.8b8 (2021-09-20)
--------------------

Fixed
~~~~~

* Empty columns after split of child table (`0ec4e97 `__)

1.0.7b8 (2021-09-20)
--------------------

Changed
~~~~~~~

* new analyzer engine (`58c4673 `__)

Fixed
~~~~~

* fix pretty title generation for count columns (`637b876 `__)
* **cli:** fixed dump analyzed data to state file (`608f03d `__)
* **flatten:** fixed splitting behavior (`0514883 `__)

1.0.7b7 (2021-09-10)
--------------------

Fixed
~~~~~

* additional formatting of human readable titles (`538592c `__)

1.0.6b7 (2021-09-09)
--------------------

Added
~~~~~

* human readable titles are extracted from schema (`09f6f8a `__)

1.0.6b6 (2021-07-14)
--------------------

Fixed
~~~~~

* combined table's extensions are missing from export (`b1e897b `__)
* table's extensions are missing from export (`6ebd662 `__)
* ``JsongrefError`` for record schema (`b2d6066 `__)
* locale is set to EN if it's absent (`b5d2aa9 `__)

1.0.5b6 (2021-07-14)
--------------------

Added
~~~~~

* ordering tables according to schema (`894d399 `__)

1.0.5b5 (2021-06-24)
--------------------

Added
~~~~~

* multiple file input support (`225c570 `__)

1.0.5b4 (2021-06-18)
--------------------

Fixed
~~~~~

* ``.gz`` format recognition enhancement (`9283ba4 `__)

1.0.4b4 (2021-06-17)
--------------------

Added
~~~~~

* added ``.gz`` support (`ed60751 `__)

1.0.4b3 (2021-06-15)
--------------------

Fixed
~~~~~

* pass ``multiple_values`` via ``DataPreprocessor`` (`509a06d `__)

1.0.3b3 (2021-06-15)
--------------------

Added
~~~~~

* added jsonl support (`59ec81c `__)

Fixed
~~~~~

* fix ``FileFlattener`` input (`acacd87 `__)

1.0.3b2 (2021-06-07)
--------------------

Added
~~~~~

* add Row and Rows containers to keep rows data and their relations (`4e8a385 `__)

Fixed
~~~~~

* **cli:** fixed variable shadowing in a loop (`1a55141 `__)
* fix parentTable generation for combined tables (`5e06bf0 `__)
* parentID should be rowID for parent table (`c429309 `__)
* .xlsx writer ``only`` error handling (`ebc2ad0 `__)
* **setup:** add include_package_data to metadata (`db8b63b `__)

1.0.2b1 (2021-06-02)
--------------------

Fixed
~~~~~

* **analyze:** recalculate headers recursively (`ca1c521 `__)
* **stats:** pregenerate headers for extension table when detected (`648485c `__)
* **stats:** fix inserting array columns into rolled up table columns (`d6d6195 `__)
* Use correct type annotation for List (`9d16a3f `__)

1.0.1b1 (2021-05-27)
--------------------

Fixed
~~~~~

* **flatten:** strict columns match in only option

1.0.0b1 (2021-05-26)
--------------------

Added
~~~~~

* **cli:** add --unnest-file, --repeat-file and --only-file options (`9b024e2 `_)
* **cli:** add click integration with logging (`3c1184f `_)
* **cli:** add informational messages about only, unnest and repeat (`2e6d48e `_)
* **cli:** add language option (`1d89e0b `_)
* **cli:** add progressbar when analyze file (`49e4440 `_)
* **cli:** enable only and repeat options (`8b82f9e `_)
* **cli:** use click.progressbar in heavy operations (`1e27a09 `_)
* **cli:** use csv and xlsx options to provide output paths (`bf8689d `_)
* **csv:** more exception handling in csv writer (`9e85095 `_)
* **flatten:** add exclude option to remove table from export (`26025dd `_)
* **flatten:** implement only option to specify list of output cols (`a57200b `_)
* **i18n:** add custom babel extractor to produce schema paths (`f602a69 `_)
* **i18n:** add locale override option when using gettext (`638b9a8 `_)
* **i18n:** use localization mechanism as tool to generate h/r titles (`5e20df3 `_)
* add ability to rename sheet (`9d4c68d `_)
* add DataPreprocessor restore method to init from existing data (`1c3ada7 `_)
* implement --state-file option to restore analyzer state from file (`a8294ea `_)
* make DataPreprocessor.process_items iterable to track progress (`380196f `_)
* table threshold option now enabled by default (`42283e6 `_)

Changed
~~~~~~~

* Add lru_cache for common_prefix, and compare len() instead of using min() and max() (`694135c `_)
* Use pickle instead of json (`63a4265 `_)

Fixed
~~~~~

* **cli:** drop --split option and introduce --exclude (`35f1391 `_)
* use pkg_resources.resource_filename to access locales (`be48d77 `_)
* **stats:** fix IndexError when generating preview_rows for extra tables (`82b179b `_)
* **utils:** make resolve_file_uri understand pathlib.Path (`51e82a3 `_)
* use pickle instead of json in DataPreprocessor dump (`d0c516b `_)
* **writers:** make writers context managers (`18e4c09 `_)
* add more logging messages (`9205217 `_)
* added logger filter for repetitive messages (`f936d50 `_)
* added table abbreviation support (`85f46f3 `_)
* CLI export message edit - removed extra tables from message, added list of exported tables and number of rows for each (`9681c71 `_)
* CLI index out of range error, issue `#66 `_ (`0318558 `_)
* code refactor; added duplicate check to stats/DataPreprocessor (`fcfb611 `_)
* fix crash with additional array of strings present in data (`4e73c70 `_)
* fix KeyError with adding count column in child tables (`36d5ccc `_)
* fixed bug with regenerated headers when array is shorter than table_threshold (`3e87b4c `_)
* fixed KeyError when flattening data with additional arrays (`c7e3cd0 `_)
* increment default columns when incrementing table rows (`3c602a6 `_)
* make name '_' explicit imported (`99932e0 `_)
* strip lines when reading option file (`e57031b `_)
* use OrderedDict as map container in iter_file (`0d1df1b `_)
* writing booleans to .xlsx cells (`1d8de32 `_)
* **cli:** enable --threshold option (`852ff92 `_)
* **cli:** fix variable naming (`c17ca63 `_)
* **flatten:** fixed typo JOINABLE -> JOINABLE_SEPARATOR (`1adc440 `_)
* **flatten:** fix only option causing empty output (`c8447b0 `_)
* **flatten:** fix repeat spreading to unrelated tables (`2e16c30 `_)
* **i18n:** generate message for count columns (`a527f8d `_)
* **setup:** run babel commands via pybabel (`e449c37 `_)
* fixed mixing preview_rows and preview_rows combined (`dd1dd19 `_)
* fixed serialization of total_items (`055ff65 `_)
* remove copy column by reference in recalculate headers (`22c63f8 `_)
* **stats:** respect with_preview when appending new preview row (`cfd8663 `_)