Metadata-Version: 1.2
Name: pandas_diff
Version: 1.1.0
Summary: Python utility to extract differences between two pandas dataframes.
Home-page: https://github.com/jaimevalero/pandas_diff
Author: Jaime Valero
Author-email: jaimevalero78@gmail.com
License: MIT license
Description: Pandas Diff
        ===========
        
        |CodeFactor| |Python 3|
        
        Installation
        ------------
        
        Install pandas_diff with pip
        
        .. code:: bash
        
           pip install pandas_diff
        
        Usage/Examples
        --------------
        
        .. code:: python
        
           import pandas_diff as pd_diff
        
           import pandas as pd
        
           # Create two example dataframes
           df_infinity = pd.DataFrame([
               {"hero": "hulk", "power": "strength"},
               {"hero": "black_widow", "power": "spy"},
               {"hero": "thor", "hammers": 0},
               {"hero": "thor", "hammers": 1},
           ])
           df_endgame = pd.DataFrame([
               {"hero": "hulk", "power": "smart"},
               {"hero": "captain marvel", "power": "strength"},
               {"hero": "thor", "hammers": 2},
           ])
        
           # Get differences, using the key "hero"
           df = pd_diff.get_diffs(df_infinity, df_endgame, "hero")
        
           df
        
           #   operation  object_keys   object_values                                        object_json  attribute_changed  old_value  new_value
           #0     create       [hero]  captain marvel  {'hero': 'captain marvel', 'power': 'strength'...                NaN        NaN        NaN
           #1     delete       [hero]     black_widow  {'hero': 'black_widow', 'power': 'spy', 'hamme...                NaN        NaN        NaN
           #2     modify       [hero]            thor     {'hero': 'thor', 'power': nan, 'hammers': 2.0}            hammers          1          2
           #3     modify       [hero]            hulk  {'hero': 'hulk', 'power': 'smart', 'hammers': ...              power   strength      smart
        
        Why pandas_diff? Use cases
        --------------------------
        
        Migrating from batch to an event driven architecture
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        
        At my workplace, we run many data pipelines that pull information from
        external platforms (Active Directory, GitHub, Jira). Each run loads the
        new data by replacing the entire table.
        
        By using pandas_diff, we detect how the infrastructure changes between
        executions and stream those change events into a Kafka cluster, so
        other teams can subscribe to the events they care about. Also, by
        defining a pandas_diff step in the master pipeline, every item in our
        project has its life cycle events tracked.
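        
        As a rough illustration of the streaming step, the sketch below turns
        diff records into ``(topic, payload)`` pairs that a Kafka producer could
        send. The function name ``diff_rows_to_events`` and the
        ``changes.<operation>`` topic naming are assumptions made for this
        example, not part of pandas_diff. The records are assumed to look like
        the ``get_diffs`` output in the usage example, obtained for instance
        with ``df.to_dict(orient="records")``.
        
        .. code:: python
        
           import json
        
           def diff_rows_to_events(rows):
               """Turn diff records into (topic, payload) pairs, one per change."""
               events = []
               for row in rows:
                   # Hypothetical topic naming: one topic per operation type.
                   topic = "changes." + row["operation"]
                   # default=str is a fallback for values json cannot encode natively.
                   payload = json.dumps(row, default=str, sort_keys=True)
                   events.append((topic, payload))
               return events
        
           # Hand-written records shaped like the get_diffs output (assumption).
           rows = [
               {"operation": "create", "object_keys": ["hero"],
                "object_values": "captain marvel", "attribute_changed": None,
                "old_value": None, "new_value": None},
               {"operation": "modify", "object_keys": ["hero"],
                "object_values": "hulk", "attribute_changed": "power",
                "old_value": "strength", "new_value": "smart"},
           ]
        
           for topic, payload in diff_rows_to_events(rows):
               print(topic, payload)
        
        The producer call itself is left out, since it depends on the Kafka
        client in use.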
        
        Events log
        ~~~~~~~~~~
        
        By running pandas_diff on a table, you get an event log for every item
        in it, showing how the resources are being consumed.
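        
        One possible way to assemble such a log is sketched below: diff records
        from successive executions are grouped per item. The helper
        ``build_event_log`` and the run labels are hypothetical, and the
        records are assumed to carry the ``operation``, ``object_values`` and
        ``attribute_changed`` columns from the usage example, with
        ``object_values`` being a hashable value such as a string.
        
        .. code:: python
        
           from collections import defaultdict
        
           def build_event_log(runs):
               """Group diff records by item, keeping the order of the runs."""
               log = defaultdict(list)
               for run_label, rows in runs:
                   for row in rows:
                       entry = (run_label, row["operation"], row["attribute_changed"])
                       log[row["object_values"]].append(entry)
               return dict(log)
        
           # Two hypothetical executions, each with its own diff records.
           runs = [
               ("run-1", [{"operation": "create", "object_values": "thor",
                           "attribute_changed": None}]),
               ("run-2", [{"operation": "modify", "object_values": "thor",
                           "attribute_changed": "hammers"}]),
           ]
        
           log = build_event_log(runs)
           print(log["thor"])
           # [('run-1', 'create', None), ('run-2', 'modify', 'hammers')]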
        
        Roadmap
        -------
        
        -  Support for a standalone app
        -  Blacklist of columns
        
        Documentation
        -------------
        
        `Documentation <https://pandas-diff.readthedocs.io/en/latest/>`__
        
        .. |CodeFactor| image:: https://www.codefactor.io/repository/github/jaimevalero/pandas_diff/badge
           :target: https://www.codefactor.io/repository/github/jaimevalero/pandas_diff
        .. |Python 3| image:: https://pyup.io/repos/github/jaimevalero/pandas_diff/python-3-shield.svg
           :target: https://pyup.io/repos/github/jaimevalero/pandas_diff/
        
        
        
        
        History
        -------
        
        0.7.18 (2021-12-05)
        -------------------
        
        \* Add codacy badge 
        
        0.7.19 (2021-12-05)
        -------------------
        
        \* Add codacy badge 
        
        0.7.19 (2021-12-05)
        -------------------
        
        \* Feat filter column 
        
        0.7.20 (2021-12-05)
        -------------------
        
        \* Feat filter column 
        
        0.7.21 (2021-12-05)
        -------------------
        
        \* Add filter fest 
        
        0.7.22 (2021-12-06)
        -------------------
        
        \* Add condition that keys exist in the dataframes
        
        
        1.1.0 (2021-12-06)
        -------------------
        
        \* Add condition that keys exist in the dataframes
        
Keywords: pandas_diff
Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.6
