Library Usage
=============

File Analyzer
-------------

Analyze file
~~~~~~~~~~~~

To create an analyzer object, use:

.. code-block:: python

    from spoonbill import FileAnalyzer
    from spoonbill.common import ROOT_TABLES, COMBINED_TABLES

    analyzer = FileAnalyzer(
        '.',
        schema=path_to_schema,
        root_tables=ROOT_TABLES,
        combined_tables=COMBINED_TABLES,
        language='en',
        table_threshold=5,
    )

To analyze a file and track progress, use:

.. code-block:: python

    for bytes_read, count in analyzer.analyze_file(path_to_file):
        print(f'analyzed {count} ({bytes_read})')
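
The byte offset yielded alongside the item count can be turned into a rough percentage. The following is a minimal sketch, assuming ``bytes_read`` is a cumulative offset into the input file; the ``progress_percent`` helper is hypothetical and not part of spoonbill.

.. code-block:: python

    def progress_percent(bytes_read, total_bytes):
        """Return progress as a percentage clamped to the range 0-100."""
        if total_bytes <= 0:
            # Nothing to read, so treat the work as already complete.
            return 100.0
        return max(0.0, min(100.0, 100.0 * bytes_read / total_bytes))

    # Possible usage with the analyzer created above (not executed here):
    #
    #     import os
    #     total = os.path.getsize(path_to_file)
    #     for bytes_read, count in analyzer.analyze_file(path_to_file):
    #         pct = progress_percent(bytes_read, total)
    #         print(f'analyzed {count} items ({pct:.0f}%)')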

Storing state
~~~~~~~~~~~~~

To dump the analysis state to a file, use:

.. code-block:: python

    analyzer.dump_to_file('analyzed.state')

.. note::

    This file can be reused to create a new analyzer instance, which lets you skip the analysis step when flattening the same file multiple times.

To restore an analyzer from a state file, use:

.. code-block:: python

    analyzer = FileAnalyzer('.', state_file='analyzed.state')
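
Dumping and restoring can be combined so that analysis runs only when no state file exists yet. The following is a sketch of that pattern, not a spoonbill API: ``restore_or_analyze`` and its ``restore`` and ``analyze`` callables are hypothetical names, and the only method assumed on the analyzer is ``dump_to_file`` as shown above.

.. code-block:: python

    import os

    def restore_or_analyze(state_path, restore, analyze):
        """Return an analyzer, reusing a saved state file when one exists.

        ``restore(state_path)`` should rebuild an analyzer from the state
        file; ``analyze()`` should run a fresh analysis and return the
        analyzer. Both are caller-supplied (hypothetical) callables.
        """
        if os.path.isfile(state_path):
            return restore(state_path)
        analyzer = analyze()
        # Persist the fresh analysis so the next run can skip this step.
        analyzer.dump_to_file(state_path)
        return analyzer

    # Possible usage (not executed here): pass
    # ``lambda path: FileAnalyzer('.', state_file=path)`` as ``restore`` and a
    # function that builds the analyzer and consumes
    # ``analyzer.analyze_file(path_to_file)`` as ``analyze``.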

Flattener
---------

Flattening options
~~~~~~~~~~~~~~~~~~

To create flattening options that extract a single table (*for example, tenders*) and split it where possible, use:

.. code-block:: python

    from spoonbill.flatten import FlattenOptions

    options = FlattenOptions(**{"selection": {"tenders": {"split": True}}})

To select multiple tables (*for example, tenders and parties*), use:

.. code-block:: python

    from spoonbill.flatten import FlattenOptions

    options = FlattenOptions(**{
        "selection": {
            "tenders": {"split": True},
            "parties": {"split": True}
        }
    })
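
When many tables share the same settings, the selection mapping can be generated instead of written by hand. The following is a small sketch; ``build_selection`` is a hypothetical helper, and it assumes every table takes only the ``split`` option shown above.

.. code-block:: python

    def build_selection(table_names, split=True):
        """Map each table name to a per-table config with the ``split`` flag."""
        return {name: {"split": split} for name in table_names}

    selection = build_selection(["tenders", "parties"])

    # The mapping has the same shape as the literal above, so it could be
    # passed on as ``FlattenOptions(**{"selection": selection})``.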

Flatten file
~~~~~~~~~~~~

To flatten a file, use:

.. code-block:: python

    from spoonbill import FileFlattener

    flattener = FileFlattener(
        '.',
        options,
        analyzer,
        csv=True,  # Generate CSV files
        xlsx=True,  # Generate XLSX files
        language='en',
    )

    for count in flattener.flatten_file(path_to_file):
        print(f'Flattened {count} items')

.. note::

    The flattening routine requires the data to be analyzed beforehand.