Once the full text-reuse clusters have been generated and everything works as intended with passim v1, it would be worth running the same detection with the new Python version (v2), because:
- The Python version (v2) is more recent and actively maintained, while v1 is old and no longer maintained.
- Staying with Python, rather than adding new dependencies on Java, Spark, Scala, etc., is simpler for the project's sustainability.
- The Python version does not require the initial boilerplate-detection step, which could make the process much faster.
Hence, depending on the results, it might be relevant and useful to switch to the Python version in the medium to long term.
The action points are:
- Recompute the text-reuse clusters with the updated version (v2).
- Compute statistics and visualizations to compare the results with those of the old version (v1) computed on the new data.
- Optionally, change the approach for future text-reuse processing and/or look into how the Python version could match the performance of the Scala version.
- Document the process for using the Python version.
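
As a starting point for the comparison step, here is a minimal sketch of the kind of summary statistics that could be computed for each run. It assumes each run has already been reduced to a plain `{passage_id: cluster_id}` mapping. That input format and the helper names `cluster_summary` and `passage_overlap` are hypothetical, not passim's actual output or API, and the data is a toy example.

```python
from collections import Counter
from statistics import mean, median


def cluster_summary(assignments):
    """Summarise a {passage_id: cluster_id} mapping (hypothetical input format)."""
    sizes = Counter(assignments.values())  # cluster_id -> number of passages
    return {
        "passages": len(assignments),
        "clusters": len(sizes),
        "mean_size": mean(sizes.values()) if sizes else 0,
        "median_size": median(sizes.values()) if sizes else 0,
        "max_size": max(sizes.values(), default=0),
    }


def passage_overlap(old, new):
    """Jaccard overlap of the passage sets that each run placed in a cluster."""
    old_ids, new_ids = set(old), set(new)
    union = old_ids | new_ids
    return len(old_ids & new_ids) / len(union) if union else 1.0


# Toy data standing in for the v1 and v2 results (not real output).
v1 = {"p1": "a", "p2": "a", "p3": "b", "p4": "b"}
v2 = {"p1": "x", "p2": "x", "p3": "x", "p5": "y"}

print(cluster_summary(v1))
print(cluster_summary(v2))
print(passage_overlap(v1, v2))
```

Cluster IDs are not expected to match between versions, so this sketch only compares size distributions and passage coverage. A clustering-agreement measure could be added later if a finer comparison is needed.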