Conversation
s-matyukevich
left a comment
There was a problem hiding this comment.
Looks good, but I have a lot of minor comments.
util/src/main/java/io/grpc/util/DeterministicSubsettingLoadBalancer.java
Outdated
Show resolved
Hide resolved
util/src/main/java/io/grpc/util/DeterministicSubsettingLoadBalancer.java
Outdated
Show resolved
Hide resolved
|
|
||
| if (addresses.size() <= config.subsetSize) return allAddresses; | ||
| if (config.sortAddresses) { | ||
| // If we sort, we do so via destination hashcode. This is deterministic but differs from the goland instrumentation. |
There was a problem hiding this comment.
*golang
I guess we should align on this, not sure how realistic this is, but in theory people may want to combine go a java clients in a single ring.
There was a problem hiding this comment.
Agreed, we should align. Do you know how golang does the comparison? How does it determine one address is "less than" or "greater than" another?
There was a problem hiding this comment.
I looked into this myself. It looks like Address.Addr is a string, so golang is using simple string comparison. SocketAddress does not have an equivalent, so we'd have to call .toString() which isn't guaranteed to produce the same result. I think if we want to align, we should adjust on the golang side.
There was a problem hiding this comment.
so we'd have to call .toString() which isn't guaranteed to produce the same result
Why do you think so? String representation of an IP should be the same in all languages, but hashes could be different.
There was a problem hiding this comment.
Because if toString() is not overridden java will print out the class description. For example, the FakeSocketAddress we use here overrides toString() to be the name of the socket address, so we see [[FakeSocketAddress-server0]/{}]. Without that we get [[io.grpc.util.OutlierDetectionLoadBalancerTest$FakeSocketAddress@5ed4bc]/{}]. SocketAddress is abstract and does not have any valuable information for toString to be meaningful, and we can't guarantee all of it's extending classes override toString. Maybe that's still safer than hashcode, but there's no guarantee.
There was a problem hiding this comment.
Yes, but can we implement a custom Comparator, which will use string representation of an IP, and use it for sorting? IMO it is much better to add to gRFC that addresses must be sorted by IP rather than by some type of hash and define what hashing function should be used.
There was a problem hiding this comment.
Oh, sorry, I got the problem now, so what you are saying is that there is no way to safely get a string representation of an instance of SocketAddress in java? I guess that outside tests it will always be an instance of InetSocketAddress, which should correctly override toString() and the way FakeSocketAddress overrides it is also fine for tests. Can we receive an instance of any other class from the resolver?
There was a problem hiding this comment.
Yes that's the problem. InetSocketAddress should be the most common, yeah, but the expectations will break if folks use a custom impl of SocketAddress. In general i'm wary of making assumptions like these because we are accepting any SocketAddress, not only InetSocketAddresses, but I suppose InetSocketAddress is well enough established at this point that it's acceptable to do this and just make sure the assumption is well documented.
|
|
||
| ArrayList<EquivalentAddressGroup> eaglist = new ArrayList<>(); | ||
|
|
||
| for (EquivalentAddressGroup eag : allAddresses.getAddresses()) { |
There was a problem hiding this comment.
In what cases EquivalentAddressGroup can contain more then 1 address? I think if this happens it will break the current implementation. Maybe we should check for this case and return an error? Or we probably can construct eaglist directly from subset.
There was a problem hiding this comment.
Why do you think this would break the current implementation? There is no guarantee that it will contain only one address (if there is, we should convince them to remove the EAG because it honestly is just overcomplicating their entire implementation)
There was a problem hiding this comment.
I looked more into it and I think we should treat every EquivalentAddressGroup as a single address.
- The description of this class says the following
A group of SocketAddresses that are considered equivalent when channel makes connections.
Usually the addresses are addresses resolved from the same host name, and connecting to any of them is equally sufficient. They do have order. An address appears earlier on the list is likely to be tried earlier.
- Looks like RoundRobin LB creates a sub-channel per EAG code
- DNS resolver creates a EAG per resolved address code
I didn't figure out the use-case when EAG is useful, I think it is used in cases when a resolver can provide multiple addresses to connect to the same host, but I don't understand why this could be useful. Still, looks like the rest of the code treats EAG as a single address and probably should do the same.
There was a problem hiding this comment.
Hm okay, I'll make this change because I agree it is more in line with what they have, but I will also file an issue/question with upstream regarding its purpose. To me it honestly looks like it exists solely because it would be annoying to remove.
There was a problem hiding this comment.
Ah looks like there is an intended purpose: grpc#5263 (comment)
There was a problem hiding this comment.
Good find, this make sense. I think it confirms that we should treat EAG as a single address.
util/src/main/java/io/grpc/util/DeterministicSubsettingLoadBalancer.java
Show resolved
Hide resolved
util/src/main/java/io/grpc/util/DeterministicSubsettingLoadBalancer.java
Outdated
Show resolved
Hide resolved
util/src/main/java/io/grpc/util/DeterministicSubsettingLoadBalancer.java
Outdated
Show resolved
Hide resolved
util/src/main/java/io/grpc/util/DeterministicSubsettingLoadBalancer.java
Outdated
Show resolved
Hide resolved
util/src/test/java/io/grpc/util/DeterministicSubsettingLoadBalancerTest.java
Outdated
Show resolved
Hide resolved
util/src/test/java/io/grpc/util/DeterministicSubsettingLoadBalancerTest.java
Show resolved
Hide resolved
6efa9ee added `volatile` to `attributes` after TSAN detected a data race that was added in 91d15ce. The race was because attributes may be read from another thread after `transportReady()`, and the post-filtering assignment occurred after `transportReady()`. The code now filters the attributes separately so they are updated before calling `transportReady()`. Original TSAN failure: ``` Read of size 4 at 0x0000cd44769c by thread T23: #0 io.grpc.netty.NettyClientHandler.getAttributes()Lio/grpc/Attributes; NettyClientHandler.java:327 #1 io.grpc.netty.NettyClientTransport.getAttributes()Lio/grpc/Attributes; NettyClientTransport.java:363 #2 io.grpc.netty.NettyClientTransport.newStream(Lio/grpc/MethodDescriptor;Lio/grpc/Metadata;Lio/grpc/CallOptions;[Lio/grpc/ClientStreamTracer;)Lio/grpc/internal/ClientStream; NettyClientTransport.java:183 grpc#3 io.grpc.internal.MetadataApplierImpl.apply(Lio/grpc/Metadata;)V MetadataApplierImpl.java:74 grpc#4 io.grpc.auth.GoogleAuthLibraryCallCredentials$1.onSuccess(Ljava/util/Map;)V GoogleAuthLibraryCallCredentials.java:141 grpc#5 com.google.auth.oauth2.OAuth2Credentials$FutureCallbackToMetadataCallbackAdapter.onSuccess(Lcom/google/auth/oauth2/OAuth2Credentials$OAuthValue;)V OAuth2Credentials.java:534 grpc#6 com.google.auth.oauth2.OAuth2Credentials$FutureCallbackToMetadataCallbackAdapter.onSuccess(Ljava/lang/Object;)V OAuth2Credentials.java:525 ... Previous write of size 4 at 0x0000cd44769c by thread T24: #0 io.grpc.netty.NettyClientHandler$FrameListener.onSettingsRead(Lio/netty/channel/ChannelHandlerContext;Lio/netty/handler/codec/http2/Http2Settings;)V NettyClientHandler.java:920 #1 io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onSettingsRead(Lio/netty/channel/ChannelHandlerContext;Lio/netty/handler/codec/http2/Http2Settings;)V DefaultHttp2ConnectionDecoder.java:515 ... ```
As discovered by TSAN, the adsStream field is not synchronized.
```
WARNING: ThreadSanitizer: data race (pid=1625)
Read of size 4 at 0x00009b66fc88 by thread T23 (mutexes: write M0):
#0 io.grpc.xds.ControlPlaneClient.isReady()Z ControlPlaneClient.java:203
#1 io.grpc.xds.ControlPlaneClient.readyHandler()V ControlPlaneClient.java:211
#2 io.grpc.xds.ControlPlaneClient$AdsStream.onReady()V ControlPlaneClient.java:328
grpc#3 io.grpc.xds.GrpcXdsTransportFactory$EventHandlerToCallListenerAdapter.onReady()V GrpcXdsTransportFactory.java:145
grpc#4 io.grpc.PartialForwardingClientCallListener.onReady()V PartialForwardingClientCallListener.java:44
grpc#5 io.grpc.ForwardingClientCallListener.onReady()V ForwardingClientCallListener.java:23
grpc#6 io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onReady()V ForwardingClientCallListener.java:40
grpc#7 io.grpc.PartialForwardingClientCallListener.onReady()V PartialForwardingClientCallListener.java:44
grpc#8 io.grpc.ForwardingClientCallListener.onReady()V ForwardingClientCallListener.java:23
grpc#9 io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onReady()V ForwardingClientCallListener.java:40
grpc#10 io.grpc.internal.DelayedClientCall$DelayedListener.onReady()V DelayedClientCall.java:497
grpc#11 io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamOnReady.runInternal()V ClientCallImpl.java:781
grpc#12 io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamOnReady.runInContext()V ClientCallImpl.java:772
grpc#13 io.grpc.internal.ContextRunnable.run()V ContextRunnable.java:37
grpc#14 io.grpc.internal.SerializingExecutor.run()V SerializingExecutor.java:133
grpc#15 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V ThreadPoolExecutor.java:1130
grpc#16 java.util.concurrent.ThreadPoolExecutor$Worker.run()V ThreadPoolExecutor.java:630
grpc#17 java.lang.Thread.run()V Thread.java:830
grpc#18 (Generated Stub) <null>
Previous write of size 4 at 0x00009b66fc88 by thread T4 (mutexes: write M1, write M2, write M3, write M4, write M5):
#0 io.grpc.xds.ControlPlaneClient$AdsStream.cleanUp()V ControlPlaneClient.java:424
#1 io.grpc.xds.ControlPlaneClient$AdsStream.close(Ljava/lang/Exception;)V ControlPlaneClient.java:418
#2 io.grpc.xds.ControlPlaneClient$1.run()V ControlPlaneClient.java:130
grpc#3 io.grpc.SynchronizationContext.drain()V SynchronizationContext.java:94
grpc#4 io.grpc.SynchronizationContext.execute(Ljava/lang/Runnable;)V SynchronizationContext.java:126
grpc#5 io.grpc.xds.XdsClientImpl.shutdown()V XdsClientImpl.java:207
grpc#6 io.grpc.xds.SharedXdsClientPoolProvider$RefCountedXdsClientObjectPool.returnObject(Ljava/lang/Object;)Lio/grpc/xds/XdsClient; SharedXdsClientPoolProvider.java:144
grpc#7 io.grpc.xds.SharedXdsClientPoolProvider$RefCountedXdsClientObjectPool.returnObject(Ljava/lang/Object;)Ljava/lang/Object; SharedXdsClientPoolProvider.java:102
grpc#8 io.grpc.xds.XdsClientFederationTest.cleanUp()V XdsClientFederationTest.java:86
```
No description provided.