-
Notifications
You must be signed in to change notification settings - Fork 136
Patching to enable building swift-apis on current nightly Swift #1184
base: main
Are you sure you want to change the base?
Conversation
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here with What to do if you already signed the CLAIndividual signers
Corporate signers
ℹ️ Googlers: Go here for more info. |
@@ -118,7 +118,7 @@ if(ENABLE_PYTHON_SUPPORT) | |||
GIT_REPOSITORY | |||
git://github.com/pvieito/PythonKit | |||
GIT_TAG | |||
master | |||
6a05a15 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
6a05a15 is the last commit before CMake is removed.
@@ -189,7 +189,7 @@ if(NOT X10_FOUND AND NOT USE_BUNDLED_X10) | |||
COMMAND | |||
rm -rf <SOURCE_DIR>/bazel-bin # ${CMAKE_COMMAND} -E rm -Rrf <SOURCE_DIR>/bazel-bin | |||
COMMAND | |||
bazel build ${VISIBILITY_FLAGS} -c opt --define framework_shared_object=false //tensorflow/compiler/tf2xla/xla_tensor:x10 --nocheck_visibility | |||
bazel build ${VISIBILITY_FLAGS} -c opt --define framework_shared_object=false //tensorflow:tensorflow //tensorflow/compiler/tf2xla/xla_tensor:x10 --nocheck_visibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This enables one to extract a standalone X10 after a successful CMake build, so subsequent builds can be made with the Swift Package Manager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right. You should be able to do that irrespective. This is just building an additional label, which means that there is potentially a dependency that is missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this I think libtensorflow.so.xxx is not generated and I cannot copy them to a standalone /Library/usr/lib
for a SPM build, following the current docs/Development.md
. Interestingly this is not needed for the CMake build, which I don't know the reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, the tensorflow core itself is not needed for the build, it is only needed at runtime.
// for (i, x) in zip(lhs.indices, rhs) { | ||
// lhs[i] .*= x | ||
// } | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't get this to work, need investigation
@inlinable | ||
@inline(__always) | ||
@_semantics("autodiff.nonvarying") | ||
public func withoutDerivative<T, R>(at x: T, in body: (T) -> R) -> R { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
withoutDerivative
is removed in language, so copy-pasted to library
@@ -159,8 +159,7 @@ struct Statistics { | |||
} | |||
} | |||
|
|||
@differentiable | |||
public func _defaultLossFunction(_ ŷ: Tensor<Float>, _ y: Tensor<Int32>) -> Tensor<Float> { | |||
@differentiable(reverse)public func _defaultLossFunction(_ ŷ: Tensor<Float>, _ y: Tensor<Int32>) -> Tensor<Float> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FIXME
public func forward(_ input: Tensor<Scalar>) -> Tensor<Scalar> { | ||
let positiveAxis = (input.rank + axis) % input.rank | ||
@differentiable(reverse) | ||
public func callAsFunction(_ input: Tensor<Scalar>) -> Tensor<Scalar> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This place triggers a compiler crash in release due to debug info being generated twice. Core reason should be that the autodiff is not doing proper variable cloning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you isolate this into a standalone test case like Dan and I have done in the past (see some of the previously mentioned SRs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me try that after I got more time (have a CVPR deadline)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is the same duplicate debug info validation failure I've been seeing on internal code (and it seems like it, given that it's triggered in release builds with debug info enabled), we started seeing this with Swift commit 9a8f2ed64221635646541757b7e38b3b4e02e4b3. Patching out the two lines in InstOptUtils.cpp that call swift::salvageDebugInfo(inst)
prevents this validation failure, but that's obviously addressing a symptom not the cause. Dan and I had seen some duplicate SIL instructions being generated in the presence of inouts and control flow, but I've had a heck of a time getting a reproducer created. This might provide an excellent clue, so thanks for finding it. I'll take a look at this myself to see if I can minimize something out of it.
|
||
@inlinable | ||
public mutating func replaceSubrange<C>(_ subrange: Range<Self.Index>, with newElements: C) where C : Collection, Self.Element == C.Element { | ||
fatalError("withUnsafeBufferPointer unimplemented because TensorBuffer is abstract") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this shameless copy-pasta. I just can't get this new protocol requirement to work...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the additions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MutableCollection
had a recently fixed loophole where you could inherit default conformances that triggered an infinite recursion. You have to provide these yourself now in recent toolchains, which I believe is what is happening here.
@googlebot I signed it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are still a large number of places where you have unnecessary whitespace changes. Could you please fix them all and update the diff?
@@ -189,7 +189,7 @@ if(NOT X10_FOUND AND NOT USE_BUNDLED_X10) | |||
COMMAND | |||
rm -rf <SOURCE_DIR>/bazel-bin # ${CMAKE_COMMAND} -E rm -Rrf <SOURCE_DIR>/bazel-bin | |||
COMMAND | |||
bazel build ${VISIBILITY_FLAGS} -c opt --define framework_shared_object=false //tensorflow/compiler/tf2xla/xla_tensor:x10 --nocheck_visibility | |||
bazel build ${VISIBILITY_FLAGS} -c opt --define framework_shared_object=false //tensorflow:tensorflow //tensorflow/compiler/tf2xla/xla_tensor:x10 --nocheck_visibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right. You should be able to do that irrespective. This is just building an additional label, which means that there is potentially a dependency that is missing?
/// An internal workaround for SR-13263: debug info generation crash. | ||
internal var _sr13263Workaround: SR13263Workaround? | ||
// /// An internal workaround for SR-13263: debug info generation crash. | ||
// internal var _sr13263Workaround: SR13263Workaround? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why comment this out? Is it no longer needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving this as-is triggers a compiler crash, so I removed this workaround.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a workaround for an actual issue. I think that it would be better to verify if the underlying issue has been resolved or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The underlying issue has migrated in that you can now trigger the same crash by inserting this placeholder, rather than avoiding the crash by having it. I've seen this with other types. Unclear when the behavior inverted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing this happen with or without SR13263Workaround
commented out, on current Swift toolchains.
Nevermind. I reproduced the crash on recent nightly toolchains, using the reproducer from swiftlang/swift#55703.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing the crash affect S4TF, with or without SR13623Workaround
commented out. I'm running swift build
and swift test
, and it's not activating the crash on recent toolchains. Am I doing this right?
export TENSORFLOW_USE_RELEASE_TOOLCHAIN=1
cd s4tf
swift build -Xswiftc -DTENSORFLOW_USE_STANDARD_TOOLCHAIN \
-c release -Xswiftc -g -Xcc -I${DESTDIR}/usr/include -Xlinker -L${DESTDIR}/usr/lib
rm -f .build/release/libx10.dylib
cp ${DESTDIR}/usr/lib/libx10.dylib .build/release/libx10.dylib
swift test -Xswiftc -DTENSORFLOW_USE_STANDARD_TOOLCHAIN \
-c release -Xswiftc -g -Xcc -I${DESTDIR}/usr/include -Xlinker -L${DESTDIR}/usr/lib
@compnerd Sorry for all the whitespace diffs, made a mistake in regex replacement. |
@@ -1297,7 +1297,7 @@ extension Tensor { | |||
} | |||
|
|||
@inlinable | |||
@differentiable(wrt: self where Scalar: TensorFlowFloatingPoint) | |||
// @differentiable(reverse, wrt: self where Scalar: TensorFlowFloatingPoint) | |||
internal subscript(_ indexPath: IndexPath) -> Tensor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Were there problems with differentiating through subscripts that led to the removal of the attribute and VJP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it seems that in nightly the where clause is somewhat broken, and the VJP seems to apply to the non-TensorFlowFloatingPoint
case as well, resulting in a protocol non-compliance. I haven't got the time to investigate deeper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the where clauses broken in nightlies before or after this PR: swiftlang/swift#39826 ? I'm wondering if that fixed or caused the issue here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, let me investigate. Thanks Brad :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a manifestation of swiftlang/swift#59135, a bug that appeared in September 2021. All you had to do was comment out the @differentiable(...)
attribute on subscript(_: IndexPath)
. The other declarations - subscript(_: TensorRangeExpression...)
and _vjpSubscript
- are fine.
Co-authored-by: Saleem Abdulrasool <compnerd@compnerd.org>
@rxwei @BradLarson @dan-zheng I tried really hard to isolate the failure case, but failed. I believe this is because this bug only happens where there is some kind of inlining optimization, hence that Release only failure mode. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this would be easier to review if you split it out into logical commits which add the reverse
modifier to the @differentiable
, renames forward
to callAsFunction
, removes the TangentVector
, etc. Theoretically those would have happened individually when the change was made upstream.
@@ -189,7 +189,7 @@ if(NOT X10_FOUND AND NOT USE_BUNDLED_X10) | |||
COMMAND | |||
rm -rf <SOURCE_DIR>/bazel-bin # ${CMAKE_COMMAND} -E rm -Rrf <SOURCE_DIR>/bazel-bin | |||
COMMAND | |||
bazel build ${VISIBILITY_FLAGS} -c opt --define framework_shared_object=false //tensorflow/compiler/tf2xla/xla_tensor:x10 --nocheck_visibility | |||
bazel build ${VISIBILITY_FLAGS} -c opt --define framework_shared_object=false //tensorflow:tensorflow //tensorflow/compiler/tf2xla/xla_tensor:x10 --nocheck_visibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, the tensorflow core itself is not needed for the build, it is only needed at runtime.
/// An internal workaround for SR-13263: debug info generation crash. | ||
internal var _sr13263Workaround: SR13263Workaround? | ||
// /// An internal workaround for SR-13263: debug info generation crash. | ||
// internal var _sr13263Workaround: SR13263Workaround? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a workaround for an actual issue. I think that it would be better to verify if the underlying issue has been resolved or not.
|
||
@inlinable | ||
public mutating func replaceSubrange<C>(_ subrange: Range<Self.Index>, with newElements: C) where C : Collection, Self.Element == C.Element { | ||
fatalError("withUnsafeBufferPointer unimplemented because TensorBuffer is abstract") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the additions?
I think that this is rather an inappropriate forum for this. I don't particularly consider this PR as stale btw, there are still items to be resolved and investigated before it can be merged.
The project doesn't seem particularly enticing to me - the "rebuilding from the ground up for GPU acceleration through Metal" is a big component of that. If you were designing it for optimization for DCOM+ (DirectX is effective accessed through COM), that might be different. GPU environments are quite fragmented and designing around a particular one is a choice that I do not particularly agree with. This also seems very much as other platforms are an afterthought. The overall structure of S4TF is far better from a design standpoint - it generically solves the problem in software and has abstractions over a compilation pipeline to separate the concern of optimization and device/runtime independence. Additionally, I don't think that you realize the complexity of using ArrayFire. That codebase has a core in C++ - which means that it is a non-starter. Until the underlying ABI issues are resolved, C++ interop cannot be reliably used. |
The S4TF team archived this project in February and said they were no longer accepting pull requests to it. Merging this PR would seem to be in violation of Google’s policy. Furthermore, this PR is based on a branch called “resurrection", implying that it is an effort to un-archive S4TF. When I made that comment, I had only skimmed through the edits and got the wrong impression that @ProfFan was trying to bring S4TF out of archive mode. I thought that because of his intention, you were never going to accept the PR. I apologize for this misunderstanding. |
If Swift for TensorFlow is open for changes again, I will gladly help you out with this pull request. I have plenty of time, which is something all of you seem to greatly lack. I assume that’s why this PR has been inactive for 3 weeks. Could you outline what must be done before this pull request is approved? @rxwei @BradLarson @compnerd I have more to say on this topic and refactored it into an issue to prevent cluttering this PR. Could you please take a look at that? |
What bugs specifically are blocking this swift-apis PR? I’m trying to build this repo on Linux by redoing the build system, and I need to know exactly where autodiff will cause trouble. What workarounds can I do for each of the bugs? I have to use an official Swift release toolchain, and I’m dealing with the bugs present in Swift 5.5.2. Also, any explanation for this? |
@ProfFan @rxwei @dan-zheng @BradLarson someone please help me. Please bear with my rushed tone and run-on sentences; I have no way to continue working on S4TF because of this problem. I have tried building S4TF, but all of the solutions are failing. Bazel fails because it tries to dowload an arm64 version of Bazel 3.7.2 (only x86_64 exists). I have Bazelisk installed on my system and can't find out how to get it off. I suspect that if I could uninstall Bazelisk, I could force Bazel to download as something that works. Then, I tried just using CMake instead of SwiftPM. According to a comment in @ProfFan's PR, I could then use SwiftPM to re-build the non-X10 stuff afterward. I'm trying to rip out X10 and build a new backend in its place (possibly named At this point, I'm thinking of modifying S4TF to alternatively build on a release toolchain, using my Differentiation package. I can't do that yet because I have no way to reproduce the original build process. I also can't download the pre-built X10 binary and run it from Rosetta because I'm building from a personal fork with this PR's commits merged into it. @ProfFan modified something in the X10 directory(ies), and the existing binaries don't have these changes. If you have an Apple silicon Mac, could you send me the arm64 binaries you built in a package structured similarly to the pre-built x64 macOS X10 binaries? |
@philipturner - Regarding blocking compiler crashers, one was recently fixed, but two still remain (the 12/23/2021 nightly shapshot is the first toolchain to include the linked fix). The two remaining crashers are both due to failing assertions around debug symbols, one that triggers around mutating functions and control flow, and the other due to an optimization pass behaving oddly during release builds. The former can be worked around by moving control flow out of any mutating function that triggers the assertion failure (BatchNorm was one place we hit that here), and the latter can be avoided by disabling the generation of debug symbols when building for release. Both can be prevented if you build a non-assertion-enabled toolchain. Fixing these has been slow going, because I've had a hard time creating a concrete reproducer for the first (I do have SIL dumps, which points to a possible cause, but I need a failing test to verify a fix), and the latter only occurs under very specific optimization conditions. I do know the exact commit and lines that introduced the latter failure, but not the root cause. By "release toolchain", do you mean the ones integrated with Xcode, or the swift.org release toolchains? In either case, the 5.5 branch does not have some necessary autodiff fixes in it that have been put in place over the last year (the branching for that was a while ago), and I know that there was another compiler crasher that directly blocked tensorflow/swift-apis from building which has been fixed in main and is present in the nightly toolchain builds. I don't think switching to a release toolchain (Xcode or swift.org) will help you much in trying to build the project. If you're getting errors of "no such module _Differentiation", that sounds like the build process is still trying to use the default Swift Xcode toolchain and not one from Swift.org, so you could try to manually place the path to the correct toolchain in your PATH ahead of the Xcode toolchain in order to force the use of the right toolchain. Regarding building TensorFlow, getting Bazel set up can be a bit of a pain, and building TensorFlow from source can take as long as building the entire Swift toolchain. If you're trying to build this on an M1, I'm not sure that all even works with the version of TensorFlow you'd need to build to connect to the old Swift for TensorFlow interfaces (2.4.0 if I'm reading things right). I think you'll have to build that on X86_64. You may also need to match up the exact version of Bazel that had been used to build TensorFlow (somewhere between 3.7.2 and 3.99.0, but it sounds like you've got the correct 3.7.2). |
@BradLarson I'm continuing this conversation in the "requesting assistance" issue to prevent cluttering this PR. |
start with x86 rather than jump to m1! |
I have a few hours tomorrow morning where I can access my Intel Mac mini. Hopefully, I can bridge from the Swift dev toolchain to release within that timeframe, then proceed on Colab (also x86) for the rest of the day.
…Sent from my iPhone
On Jan 3, 2022, at 10:33 PM, brett koonce ***@***.***> wrote:
start with x86 rather than jump to m1!
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you were mentioned.
|
@philipturner wrt to #1185 (comment) + going off of what brad was saying, you should make sure you have the same cmake version as well! |
@ProfFan (or @compnerd or @BradLarson if you reproduced his steps while reviewing in October) could you let me know which Swift nightly snapshot was used while building this fix? I'm having trouble reproducing your results on Linux. I attached the Colab notebook where I'm facing issues right now. It's untidy, but hopefully you can use it to reproduce my conditions (please run on Google Colab and read the comments). If you'd rather not run it, I downloaded CMake 3.22.1, Bazel 3.1.0, and took the CMake route to compiling with a pre-compiled Ubuntu x64 binary. I pulled source code from ProfFan's I'm facing problems with it freezing after logging the command below. I faced a similar silent freezing error when attempting to build S4TF on my Mac mini (with a different cause), and I later found that the program had actually stopped and would log the error if run slightly differently. I can't figure out how to reveal that error in the Colab environment though.
On the Swift releases website, it doesn't show snapshots earlier than November. Yet, these patches were made in October. I just need the date of the snapshot used, and the snapshot from then might still be downloadable. This is all to narrow down the cause of the error and reproduce the exact same environment that @ProfFan somehow successfully built S4TF in. |
I got S4TF to build successfully once in a Colab environment, although I used a pre-compiled x10. |
@ProfFan there was a build failure on the 2021-12-23 snapshot that didn't occur on the 2021-11-12 snapshot. This should be worth investigating to add commits to your PR. The crash log is in one of the later cells of this notebook: The crash log was related to I also tested it on 2022-01-06, with the same result. |
If I'm reading these notebook results correctly, it looks like the 12/23/2021 nightly snapshot toolchain got a lot farther in building than the 11/12/2021 one. I vaguely remember being able to use the 2021-07-07 nightly (downloadable from here for X86 Ubuntu 18.04) to build this branch before, but I could be wrong about that. That was a good, stable snapshot for autodiff for a while. Looking at the failure on the 12/23/2021 toolchain, it appears that the RequirementMachine is unable to solve out the TangentVector for RecurrentLayer (type defined at swift-apis/Sources/TensorFlow/Layers/Recurrent.swift, lines 370 - 469). If that could be minimized, it'd be great to report upstream for Richard and / or Slava to look at. If this persists with the just-released 1/6/2022 toolchain, let me know and I can minimize and report that. As a workaround, you might be able to define a custom TangentVector for RecurrentLayer that explicitly states what the compiler is currently generating for that TangentVector and that might help the solver here. |
2021-11-12 actually completed the build (in fact, it was the only toolchain that succeeded so far). I had some problems with it previously, but they were my fault and I resolved them. I didn’t verify whether it compiles on 2021-07-07. The problem persists in the 2022-01-06 toolchain. Below is another Colab notebook that ran on 2021-01-06. Is this sufficient content for you to generate your bug report? |
@BradLarson could you give super precise details on how to implement the workaround? I anticipate that my lack of experience with differentiation will unnecessarily waste time if I tackle the problem alone. Plus, it takes 2 minutes to reach the build failure, unlike Xcode where you can cache build products (I might change that by moving to my Mac mini if this gets serious enough). |
I haven't verified that this will compile, but you'll want to add something like the following before line 376 in Recurrent.swift: public struct TangentVector: Differentiable, AdditiveArithmetic {
public typealias TangentVector = RecurrentLayer.TangentVector
public var cell: Cell.TangentVector
public init(cell: Cell.TangentVector) {
self.cell = cell
}
}
mutating func move(by offset: TangentVector) {
self.cell.move(by: offset.cell)
} That should provide what you need to satisfy a manual TangentVector for the RecurrentLayer type. Again, not entirely sure that'll compile, but something close to that is what you'd need. |
Thanks! That’s what I was asking for! And any progress/insights on the bug with VJP pow derivative? (I assigned it to you on JIRA)
…Sent from my iPhone
On Jan 7, 2022, at 9:18 PM, Brad Larson ***@***.***> wrote:
@BradLarson could you give super precise details on how to implement the workaround?
I haven't verified that this will compile, but you'll want to add something like the following before line 376 in Recurrent.swift:
public struct TangentVector: Differentiable, AdditiveArithmetic {
public typealias TangentVector = RecurrentLayer.TangentVector
public var cell: Cell.TangentVector
public init(cell: Cell.TangentVector) {
self.cell = cell
}
}
mutating func move(by offset: TangentVector) {
self.cell.move(by: offset.cell)
}
That should provide what you need to satisfy a manual TangentVector for the RecurrentLayer type. Again, not entirely sure that'll compile, but something close to that is what you'd need.
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you were mentioned.
|
@BradLarson the crash still occurs (2022-01-06), even with your workaround. It says this in the crash log:
I noticed that it says It happened at this place (around line 440), which doesn't have a manually defined derivative. Will manually defining a derivative fix it, and what code should I put as the derivative? @differentiable(reverse)
public func callAsFunction(_ inputs: [Cell.TimeStepInput]) -> [Cell.TimeStepOutput] {
let initialState = withoutDerivative(at: cell.zeroState(for: inputs[0]))
return self(inputs, initialState: initialState)
} Colab notebook: |
This comment has been minimized.
This comment has been minimized.
You have to click "Review" after you make review comments. |
This comment has been minimized.
This comment has been minimized.
* Update Normalization.swift
@ProfFan do any of your current projects depend on S4TF? If so, wouldn't your work be impacted by the new crash that appeared in 2021-12-23? |
No I am not doing any S4TF in my current work |
This comment has been minimized.
This comment has been minimized.
I have no bandwidth for this now. Maybe I will have time after 2 months. |
Currently this builds in
Debug
on the latest nightly toolchain on Linux. macOS is untested.All unit tests pass except one, which is not crucial (only a floating point difference of 1e-6).
Building on
Release
is blocked by swiftlang/swift#38745 (comment)