Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support query for output record derivation trace. #725

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

yjiayu
Copy link
Contributor

@yjiayu yjiayu commented Aug 5, 2020

  1. Add maps to store all debug event entries and extract output record and its input record
    and store these information.
  2. Add helper function to get predecessor operatorId for input record from syntax tree
  3. Add tree-like 'DebuggerRecordNode' structure to store all possible derivation results
    for a given output record with operator id.

@yjiayu yjiayu force-pushed the master-derivation branch from 3f97622 to 3d7ec5c Compare August 5, 2020 23:00
debugger/Main.hs Outdated
d''' <- case confOutputInput of
"" -> return d''
x -> return $ progMirrorInputRelations d'' x
when confDumpFlat $

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to add this option to dump the transformed program in the debugger program. If we want to dump it, we can always use the ddlog binary to do it.

debugger/Main.hs Outdated
parseProgram Config{..} = do
fdata <- readFile confDatalogFile
(d, _, _) <- parseDatalogProgram (takeDirectory confDatalogFile:confLibDirs) True fdata confDatalogFile
d'' <- case confOutputInternal of

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that we have to ensure that we pass the same options when we compiled the ddlog program and here.
Instead of loading the original program here (and ensuring the same options are passed), I wonder if we can just load the dumped transformed source here? I haven't thought about it deeply yet, so not sure if it's possible to generate DatalogProgram from an ast file?

@@ -0,0 +1,179 @@
{-
Copyright (c) 2018-2020 VMware, Inc.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the convention is for new files, but usually the year here should be the year it was added (i.e., just 2020)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@haroldlim , yep

debugger/Main.hs Outdated
s = queryAll events (dbgRecordMap recordMap)
putStr $ show s

queryAll :: [Event] -> DebuggerRecordMap -> String

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, currently the debugger tool will just print/query all derivations of all output records?

Not required for this change, but I wonder if we can make it more interactive (i.e., list out the output records, and then we can specify which output record we want to trace the derivation).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, for now the debugger tool's function is
queryDerivations :: DebuggerRecord -> DebuggerRecordMap -> DebuggerRecordNode
It takes one output record and debuggerRecordMap (got built when parsing log events) and return the root node of the derivations since the derivation result is a tree like structure.
According to Leonid, we are planning to design a UI that will take the debugger output and then display it in a better way, probably is a next step work.

Copy link
Contributor

@ryzhyk ryzhyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @yjiayu! The overall structure looks good, I think we're on the right track. I have a bunch of comments on specific data structures and algorithms. @yjiayu, we can go through them in detail when you're available.

import Language.DifferentialDatalog.Validate
import Language.DifferentialDatalog.Debug

data TOption = Help
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to copy all flags from the DDlog executable. LibDir and Help seem like the only relevant ones.

@@ -51,6 +51,7 @@ data Config = Config { confDatalogFile :: FilePath
, confDumpDebug :: Bool
, confDumpOpt :: Bool
, confReValidate :: Bool
, confDebugDumpFile :: FilePath
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debugger should have its own Config type, which should include this field.

@@ -182,12 +183,17 @@ insertRHSInspectDebugHooks d rlIdx rule =

debugUpdateRHSRules :: DatalogProgram -> Int -> Rule -> [RuleRHS]
debugUpdateRHSRules d rlIdx rule =
let rhs = debugUpdateRHSRulesWithoutHooks d rlIdx rule
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this offline, but just to make sure we're on the same page, with #729 we should not need to perform any kind of preprocessing in the debugger, right?

@@ -0,0 +1,179 @@
{-
Copyright (c) 2018-2020 VMware, Inc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@haroldlim , yep

type DebuggerRecordMap = M.Map DebuggerRecord [Derivation]
type DebuggerRecordWeightMap = M.Map DebuggerRecord Int

data OperatorInput = InputOp OperatorId
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment here would be great.

in case traceRecords of
Nothing -> let updatedDbgRecordMap = M.insert outputRecord [derivation] dbgRecordMap
in DebuggerMaps { dbgRecordMap = updatedDbgRecordMap, dbgRecordWeightMap = updatedDbgRecordWeightMap}
Just derivations -> let updatedDbgRecordMap = M.insert outputRecord (derivations ++ [derivation]) dbgRecordMap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will introduce duplicate derivations, and because we do not track individual derivation weights, we won't be able to eliminate mutually canceling derivations.

Just derivations -> let updatedDbgRecordMap = M.insert outputRecord (derivations ++ [derivation]) dbgRecordMap
in DebuggerMaps { dbgRecordMap = updatedDbgRecordMap, dbgRecordWeightMap = updatedDbgRecordWeightMap}

constructRecordMap :: [Event] -> DebuggerMaps -> DatalogProgram -> DebuggerMaps
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems nearly identical to handleDebugEvents.

-- Get the operator if for input records in a debug entry
getPredecessorOpId :: OperatorId -> DatalogProgram-> [OperatorInput]
getPredecessorOpId OperatorId{..} DatalogProgram{..} =
let Rule{..} = progRules !! ruleIdx
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will crash if ruleIdx is out of range. We should not trust input data. One way to deal with this is to run functions that can fail inside MonadError. See, e.g., Validate.hs for examples of this.

if rhsIdx == 0
then [InputRel (atomRelation rhsAtom)]
else let prevRuleRhs = ruleRHS !! (rhsIdx - 1)
in case prevRuleRhs of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the case analysis here should not be whether prevRuleRhs is a literal or not, but rather whether rhsIdx - 1 == 0.

_ -> [InputOp OperatorId {ruleIdx = ruleIdx, rhsIdx = (rhsIdx - 1), headIdx = headIdx}, InputRel (atomRelation rhsAtom)]
RHSCondition{..} -> let prevRhsIdx = getPredecessorRHSRuleIdxForCondition rhsIdx ruleRHS
in [InputOp OperatorId {ruleIdx = ruleIdx, rhsIdx = prevRhsIdx, headIdx = headIdx}]
_ -> [InputOp OperatorId {ruleIdx = ruleIdx, rhsIdx = (rhsIdx - 1), headIdx = headIdx}]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we want to consider two cases, based on whether rhsIdx - 1 == 0. If it is then we want InputRel, otherwise InputOp

1. Add maps to store all debug event entries and extract output record and its input record
   and store these information.
2. Add helper function to get predecessor operatorId for input record from syntax tree
3. Add tree-like 'DebuggerRecordNode' structure to store all possible derivation results
   for a given output record with operator id.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants