Dedicated repository to store examples of pre-compiled datasets #5
A lot of the reference files were already created at https://github.com/HXL-CPLP and https://github.com/EticaAI/HXL-Data-Science-file-formats/tree/main/ontologia, except that those reference files started to get too big to store in HXL-Data-Science-file-formats. Another major point (which is actually not about code at all) is deciding how to assign a numeric identifier to codes which do not have one. Such a reversible algorithm would actually be a pretty common need, but that is a future issue.
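As a rough illustration only (this is not the project's chosen algorithm, and every name here is hypothetical), one reversible mapping treats an uppercase alphanumeric code as a base-36 integer, with a guard digit so leading zeros survive the round trip:

```typescript
const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Hypothetical sketch of a reversible code <-> number mapping.
function codeToNumber(code: string): bigint {
  let n = 1n; // leading guard digit so leading zeros survive the round trip
  for (const ch of code.toUpperCase()) {
    const v = ALPHABET.indexOf(ch);
    if (v < 0) throw new Error(`unsupported character: ${ch}`);
    n = n * 36n + BigInt(v);
  }
  return n;
}

function numberToCode(n: bigint): string {
  let digits = "";
  while (n > 0n) {
    digits = ALPHABET[Number(n % 36n)] + digits;
    n /= 36n;
  }
  return digits.slice(1); // drop the guard digit
}

// numberToCode(codeToNumber("AO")) === "AO"
```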
Repository https://github.com/EticaAI/n-data was renamed to https://github.com/EticaAI/lsf-cache
Even if we do not do something such as precompiling all public UN P-Codes (here the focus is not on GIS, but on their metadata, such as names), we would still need data tables.
These data tables (even the most basic ones) would start to bloat the history of this main repository (which is more focused on documentation and reference code). So an alternative would be to create a different repository, share some short URL, and then use that repository as the base.
Advantages
Makes the transition between online and offline easier
By storing the data in another repository, we can already make some minimal checks in the main interface to detect whether the user is loading something such as http://localhost/numerordinatio instead of https://numerordinatio.etica.ai. Not really sure how to handle the files loaded from CDNs (such as the Bootstrap CSS and JavaScript), but if loading from localhost, then we could try searching for the datasets with relative paths, along the lines of the sketch below.
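A minimal sketch, assuming the interface is plain browser TypeScript/JavaScript; the base paths, the CDN URL, and the file name are illustrative assumptions, not the project's confirmed layout:

```typescript
// Pick a dataset base path depending on where the interface is served from.
// Both constants are hypothetical placeholders.
const DATASET_REMOTE_BASE = "https://raw.githubusercontent.com/EticaAI/lsf-cache/main";
const DATASET_LOCAL_BASE = "../lsf-cache"; // relative path for localhost/offline use

function datasetBaseUrl(): string {
  const host = window.location.hostname;
  const isLocal = host === "localhost" || host === "127.0.0.1";
  return isLocal ? DATASET_LOCAL_BASE : DATASET_REMOTE_BASE;
}

// Example: resolve one pre-compiled dataset file (file name is illustrative).
const datasetUrl = `${datasetBaseUrl()}/1603/47/1603.47.tsv`;
```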
With the data already in a dedicated repository, it becomes easier to download everything and put it on a USB stick or the like. Also, the users best placed to compile new work in the future may already get help from others who deliver most of the data pre-packaged.
"offline" access not just for privacy
One reason to have a localhost alternative is not even mere privacy or going fully offline, but actually reducing internet usage. Depending on how well optimized the interface becomes, every time a user forces a reload, it could easily keep re-downloading several small files from the internet. For example, I'm not fully aware of how many megabytes the P-Codes for the entire world (without geometry) would take, but this could easily waste a lot of bandwidth.
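One possible mitigation, sketched here as an assumption rather than anything already in the interface, is the browser Cache API, so a forced reload can serve dataset files from local storage instead of the network (the cache name is hypothetical):

```typescript
// Fetch a dataset file, serving from a local cache when possible so that a
// force reload does not re-download every small file.
async function fetchDataset(url: string): Promise<Response> {
  const cache = await caches.open("numerordinatio-datasets-v1");
  const hit = await cache.match(url);
  if (hit) {
    return hit; // serve from the local cache, no network round trip
  }
  const response = await fetch(url);
  if (response.ok) {
    await cache.put(url, response.clone()); // store a copy for next time
  }
  return response;
}
```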
Another potential advantage of this approach is that, for tables which are no longer automated, if a user needs to edit something, they can do so with a code editor (such as VSCode, opening the folder with all the datasets) and then reload the main interface to see if the abstract syntax tree still makes sense.
Disclaimer: on the history of the dedicated repository
The dedicated repository is mostly a simplified form of free static file hosting. Its GitHub history may be cleaned from time to time to save space.
Also, even operations which are not yet automated (in addition to automated ones, such as using GitHub Actions to pull data from other places) are likely to be committed under a bot account such as @eticaaibot.