Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: add TTL support #25

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Open

Conversation

GiedriusS
Copy link

Add TTL support for keys. Implemented for our use case in https://github.com/thanos-io/thanos/ where cached data can potentially change over time. Thus, some kind of TTL mechanism is needed. Implemented it in the same fashion as https://github.com/mailgun/groupcache - deadline for keys is embedded in the value in the LRU cache. On retrieval, the deadline is checked and we are past it then we remove that key.

In HTTP the expiry date is exposed via an extra header. With gRPC, a new field has been added.

Note: this implies that users will have a somewhat synchronized time across nodes in the group. Otherwise, extra fetches might happen or the keys might not ever expire.

GiedriusS and others added 3 commits November 19, 2021 16:27
Add expiration time in preparation for the TTL feature.
TTLs are based on expiration in the LRU - during getting if the value is
expired then it is removed from the LRU and we return nothing. This is
needed to have short-lived values in a distributed system. To achieve
this, expand the `valWithStat` struct with `expire time.Time`. Then,
expand all of the related interfaces:
- Codecs
- `setLRUOnEvicted`
- etc.

Since the HTTP interface returns in the body raw bytes, send the
expiration timestamp with headers. Expand the test in
`http/http_test.go` to test whether the keys are being regenerated after
the timestamp. Copy over a test from mailgun/groupcache for TTLs in the
LRU.
Copy link

@dfinkel dfinkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, Thanks for sending this PR!

With that said, I have some mixed feelings about supporting TTLs in Galaxycache in general, and the approach in this PR specifically.

First, my general opinion: Galaxycache works really well because it can treat everything as opaque and immutable. At times, expiration is better handled by changing the cache-key. (e.g. we (Vimeo's Core Services team) wrote this blog post about a scheme we've used for breaking time into epochs (per-key) instead of introducing TTLs to galaxycache.

However, there are still some places where TTLs would be useful, so I think it would be useful to have optional TTL support on some galaxies. (and we can come up with some advice on when to choose a TTL vs cleverly chosen keys)

With respect to this PR, unfortunately, I feel that there are too many breaking changes as-is. (I've noted a few in my comments)

In particular, I think that the Codec interface is a key public interface that would break a lot of client code if it changed much. However, I do like the change to RemoteGetter, I don't expect there to be many implementations of that outside this repository, so returning a new value is reasonable. (although we might want to make that new return-value a struct so we don't have to break the interface every time we add a new field)

I think an improvement would be to instead add a GetWithExpiry method to Galaxy and create an ExpiringBackendGetter interface for users to implement. (that has a GetWithExpiry method instead of a Get method). We'd have galaxycache type-switch so it prefers GetWithExpiry implementations.

With respect to lru.Cache, I think we should copy the existing implementation and add an lru.CacheWithExpiry, so galaxies that have BackendGetter implementations rather than ExpiringBackendGetter implementations wouldn't need to track the additional timestamps.

One thing that one of my teammates pointed out (@arodland ) is that the LRU supporting expiration should track expiring entries so upon eviction we don't evict an unexpired entry if there are any expired entries. (we might need a tree-map or skip-list of some sort to make that efficient for both insertion of arbitrary TTLs and eviction of the oldest one)

@@ -17,6 +17,8 @@ limitations under the License.
syntax = "proto3";

package galaxycachepb;
option go_package = "./galaxycachepb";
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this should be a fully-qualified go package path, so github.com/vimeo/galaxycache/galaxycachepb.

@@ -1,212 +1,256 @@
//
//Copyright 2012 Google Inc.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding a license header, but as an auto-generated file, it really doesn't apply.
(also, the galaxycachepb gRPC interface should be copyright Vimeo inc. since that portion wasn't part of galaxycache)

@@ -26,8 +28,9 @@ message GetRequest {
message GetResponse {
bytes value = 1;
double minute_qps = 2;
int64 expire = 3;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think I'd rather use the Timestamp -- which translates to timestamppb in Go

Comment on lines +11 to +12
github.com/gogo/protobuf v1.3.2 h1:Ov1cvc58UF3b5XjBnZv7+opcTcQFZebYjWzi34vdm4Q=
github.com/gogo/protobuf v1.3.2/go.mod h1:P1XiOD3dCwIKUDQYPy72D8LYyHL2YPYrpS2s69NZV8Q=
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks wrong; we don't need gogo protobuf for anything.
perhaps this needs a go mod tidy?

@@ -20,6 +20,7 @@ import (
"context"

gc "github.com/vimeo/galaxycache"
"github.com/vimeo/galaxycache/galaxycachepb"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a second import for the same thing as the one immediately below it.
We should either remove the pb alias and update the other references or use pb for the new reference.

Having two imports for the same thing with different names is confusing.

@@ -55,7 +57,7 @@ func New(maxEntries int) *Cache {
}

// Add adds a value to the cache.
func (c *Cache) Add(key Key, value interface{}) {
func (c *Cache) Add(key Key, value interface{}, expire time.Time) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many uses of this lru.Cache type outside galaxycache. I'd rather not break the interface like this.
If we extend the existing type, it would be best to keep the signature of Add unchanged and add an AddExpiring method.

Comment on lines +1 to +2
#!/bin/bash
protoc --go-grpc_out=. --proto_path=galaxycachepb --go_out=galaxycachepb --go_opt=paths=source_relative galaxycache.proto
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd rather have this as a //go:generate line in a generate.go in the galaxycachepb package.
proto.sh is a bit non-standard.

Comment on lines +37 to +40
type ByteCodec struct {
bytes []byte
expire time.Time
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change breaks the common use-case of ByteCodec. (same with the other two public Codec implementations in this file)

This change makes it impossible to extract the value out for anyone outside this package.

Comment on lines +24 to +25
MarshalBinary() ([]byte, time.Time, error)
UnmarshalBinary(data []byte, expire time.Time) error
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very disruptive way to side-channel the expiration time around that breaks all existing Codec implementations as well as all BackendGetter implementations.

stats *keyStats
data []byte
stats *keyStats
expire time.Time
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit memory-inefficient to add the expiration time to both to the LRU's value struct and the value we're storing inside the LRU.

@GiedriusS
Copy link
Author

cc @akanshat

@GiedriusS
Copy link
Author

Thank you for your thorough review, we will try the epochs approach and come back to you.

GiedriusS and others added 6 commits June 21, 2022 10:14
Upgrade internal interfaces for multi-key fetch:

- Use multi-part HTTP request/response for retrieving data about
multiple keys. Boundary is randomly generated by Go standard library so
this seems like the best path;
- Upgrade gRPC interface to have `repeat string Keys` instead of `string
Key`. `key` has priority over `keys` but users should use `keys` either
way.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Switch to Go 1.17 && enable golangci-lint.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
* Add `ValueWithTTL` struct for grouping value with time.Time expiration
time;
* Update interfaces to use it;
* Add stub for multiple key retrieval;
* Fix golangci-lint errors.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants