-
Read data from a gzipped MongoDB dump file (extension: xxx.bson.gz) using Python
- Python modules required: pymongo (version 3.11.4), pandas (version 1.2.1)
- use the bson_reader() function in bson_dump_read.py to convert the xxx.bson.gz file into a file_iterator or a file list
- use the bson_to_dateframe() function in bson_dump_read.py to convert the file_iterator into a dataframe, reading the documents in batches
-
Plot a large network graph using Python and Gephi
- read the edges .csv file into a dataframe using pandas
- save it as an xxx.gexf file using the dataframe_to_gexf() function in the Gephi_to_dataframe.py script, then open it in the Gephi application and adjust the layout until you are satisfied
- save the final layout as a processed_xxx.gexf file
- use the gexf_to_dataframe() function in Gephi_to_dataframe.py to convert the processed_xxx.gexf file to a dataframe
- plot the network with the network_plot() function in Gephi_to_dataframe.py, using the node coordinates extracted from the processed_xxx.gexf file
- see the example here
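
As a rough sketch of the round trip described above (edge list to GEXF, then GEXF with a saved layout back to a dataframe), the snippet below uses networkx. That is an assumption: Gephi_to_dataframe.py may be implemented differently. The helper names edges_to_gexf() and gexf_positions() and the "source"/"target" column names are hypothetical.

```python
import networkx as nx
import pandas as pd


def edges_to_gexf(edges, path, source="source", target="target"):
    """Write an edge-list dataframe to a GEXF file that Gephi can open."""
    graph = nx.from_pandas_edgelist(edges, source=source, target=target)
    nx.write_gexf(graph, path)
    return graph


def gexf_positions(path):
    """Read a GEXF file and return one row per node with its x/y coordinates.

    networkx exposes the layout stored in the file under each node's "viz"
    attribute; nodes without a stored position get missing x and y values.
    """
    graph = nx.read_gexf(path)
    rows = []
    for node, data in graph.nodes(data=True):
        pos = data.get("viz", {}).get("position", {})
        rows.append({"node": node, "x": pos.get("x"), "y": pos.get("y")})
    return pd.DataFrame(rows, columns=["node", "x", "y"])
```

The resulting dataframe can be turned into a {node: (x, y)} dictionary and passed as the pos argument of nx.draw() to reproduce the Gephi layout in Python.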
-
Plot a large network graph using Tableau
- create "df_tableau_network.csv" using the functions in the Tableau_file.py script (see details inside the .py file)
- in Tableau, read in the saved csv file "df_tableau_network.csv" as a text file, then select an empty sheet
- drag X into Columns and drag Y twice into Rows, then set all X and Y fields to dimensions
- in the Marks section, change Y(2) to the Line type
- right-click one of the Y fields and select "Dual Axis"
- right-click the merged Y axis and select "Synchronize Axis"
- add the "key" column to the Detail of Y(2)
- fine-tune the edge thickness and color, the node size and color, etc.
- see the example here
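
The Tableau steps above rely on a table with one row per edge endpoint, so that the Line mark can connect the two rows sharing the same "key". A minimal sketch of building such a table is given below. It is not the code in Tableau_file.py: the function name tableau_network_table() and the input column names ("source", "target", "node", "x", "y") are assumptions; only X, Y and key come from the steps above.

```python
import pandas as pd


def tableau_network_table(edges, positions):
    """Return one row per edge endpoint with columns key, node, X, Y.

    edges:     dataframe with "source" and "target" columns (assumed names)
    positions: dataframe with "node", "x" and "y" columns (assumed names)
    """
    pos = positions.set_index("node")
    rows = []
    for key, (src, dst) in enumerate(zip(edges["source"], edges["target"])):
        # Both endpoints share the same key, so Tableau draws one line per edge.
        for node in (src, dst):
            rows.append({
                "key": key,
                "node": node,
                "X": pos.at[node, "x"],
                "Y": pos.at[node, "y"],
            })
    return pd.DataFrame(rows, columns=["key", "node", "X", "Y"])
```

Saving the result with to_csv("df_tableau_network.csv", index=False) gives a file that can be loaded as described in the steps above.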