refactoring-steps.md

Preparation phase

  • set up the local dev environment
  • create a copy of the notebook
  • ensure that the current notebook runs without errors
  • identify code smells:
    • [X] dead code (executed code whose result is never reused): print, display, show, df.printSchema ...
    • [X] explicitly exposed implementation details
    • [X] duplication
    • [X] magic commands (%sql, %scala, %python)
    • [X] too many comments
    • etc.
  • convert the notebook into a python file
  • remove dead code
  • group the code, where possible, into 3 sections (global functions): extract(), transform(), load()
  • resolve dependencies between the sections (extract(), transform(), load()) and ensure that the notebook still runs successfully
  • start the refactoring phase of the 3 sections (one by one)
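
As a rough illustration of the grouping step, the converted file might end up shaped like the sketch below. Only the three function names extract(), transform() and load() come from the steps above; the in-memory rows, the `sink` list and the `main()` wrapper are hypothetical stand-ins for whatever the real notebook reads and writes.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def extract() -> List[Row]:
    """Read the raw input (hypothetical in-memory rows stand in for the real source)."""
    return [
        {"name": " Alice ", "amount": "10"},
        {"name": "Bob", "amount": "5"},
    ]


def transform(rows: List[Row]) -> List[Row]:
    """Clean the rows: trim names and convert amounts to integers."""
    return [
        {"name": str(row["name"]).strip(), "amount": int(row["amount"])}
        for row in rows
    ]


def load(rows: List[Row], sink: List[Row]) -> int:
    """Write the rows to the target (a plain list here) and return the row count."""
    sink.extend(rows)
    return len(rows)


def main(sink: Optional[List[Row]] = None) -> List[Row]:
    """Wire the three sections together so dependencies between them are explicit."""
    sink = [] if sink is None else sink
    load(transform(extract()), sink)
    return sink


if __name__ == "__main__":
    print(main())
```

Passing data between the sections through arguments and return values, rather than through notebook-level globals, is one way to make the dependencies mentioned above visible before the refactoring phase starts.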

Refactoring phase

  • run a characterisation test and keep it running under pytest-watch
  • do()
    • identify a block of code that can be exported to a python module
    • write the test for the python module
    • write the python module
    • make the test pass (read and analyze continuous feedback from pytest-watch)
    • use the python function in the main code
    • commit the last changes
    • refactor again ...
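
One pass through the loop above could look like the following sketch. The helper `normalize_name`, the sample rows and the pinned `EXPECTED` value are all hypothetical; in a real project the helper would live in its own module and the tests in a separate test file, with pytest-watch (the `ptw` command) re-running them on every save.

```python
from typing import Dict, List


# Extracted helper (in practice this would sit in its own module file).
def normalize_name(raw: str) -> str:
    """Collapse runs of whitespace and title-case a name."""
    return " ".join(raw.split()).title()


# Main code now calls the extracted helper instead of inlining the logic.
def transform(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    return [{"name": normalize_name(row["name"])} for row in rows]


# Unit test written for the new module function.
def test_normalize_name_collapses_whitespace_and_title_cases():
    assert normalize_name("  alice   smith ") == "Alice Smith"


# Characterisation test: pins the output observed before refactoring,
# so any behaviour change in transform() shows up immediately.
EXPECTED = [{"name": "Alice Smith"}, {"name": "Bob"}]


def test_transform_characterisation():
    rows = [{"name": "  alice   smith "}, {"name": "BOB"}]
    assert transform(rows) == EXPECTED
```

The characterisation test stays unchanged across iterations, while a new unit test is added for each block that gets extracted.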