JAVA-727: Allow monotonic timestamp generators to drift in the future + use microsecond precision when possible.#591
Conversation
| import static org.testng.Assert.fail; | ||
|
|
||
| public class AtomicMonotonicTimestampGeneratorTest { | ||
| public class AtomicMonotonicTimestampGeneratorTest extends CCMTestsSupport { |
There was a problem hiding this comment.
Could we keep it unit test only? It pains me that we have to spin up a Cassandra cluster to test such a simple component.
|
Open question: would there be some value in leaving the old behavior as an alternative? One issue with a drifting generator is that you could loose the real-time ordering between multiple clients if only one of them has drifted in the future. Although this was possible as well with the previous implementations... At least we should document it. Speaking of documentation, we should also update the javadoc of the two implementations, and probably briefly mention the drifting behavior in the manual ( |
| * and adds the number of nanoseconds since the last call to get the current time, which is good | ||
| * enough accuracy for our purpose (see CASSANDRA-6106). | ||
| */ | ||
| class NativeClock implements Clock { |
There was a problem hiding this comment.
This comes from Sylvain Lebresne's suggestion in CASSANDRA-6106.
I very much doubt that two clients would be in sync enough to be able to order their statements so that the real last one wins. But if everyone agrees, I can leave the old behavior as an alternative.
Yes good catch, I will add a note on that. |
| * Base implementation for generators based on {@link System#currentTimeMillis()} and a counter to generate | ||
| * the sub-millisecond part. | ||
| */ | ||
| abstract class AbstractMonotonicTimestampGenerator implements TimestampGenerator { |
There was a problem hiding this comment.
I fixed the filename which wasn't in sync with the class name.
|
Switched to JNR call to
The call costs avg. 200-300 ns in my benchmarks where a call to Complete benchmark results under Linux: Note: The "approx" benchmarks use Sylvain's method. |
|
Squashed and rebased |
| * <p/> | ||
| * On Linux systems, however, it is possible to have a better granularity by using a JNA | ||
| * call to {@code clock_gettime}. To benefit from this system call, set the system | ||
| * property {@code com.datastax.driver.USE_NATIVE_CLOCK} to {@code true}. |
There was a problem hiding this comment.
It's now set to true by default.
There was a problem hiding this comment.
Good catch, will amend, thanks.
|
All remarks addressed, thanks for your feedback. |
|
Squashed again in preparation for JAVA-444, sorry for the inconvenience. |
|
Rebased on current 2.1 (2708bd7) |
|
Squashed and rebased on top of current 2.1. |
|
Rebased again, with a few minor changes (mainly doc). |
| * @return the next timestamp to use | ||
| */ | ||
| protected long computeNext(long last) { | ||
| long current = clock.currentTimeMicros(); |
There was a problem hiding this comment.
Could there be value in adding a warning statement if we detect current being 1000ms less than last? That could indicate a clock drift, which might be worth logging (although we wouldn't want to spam it(.
| libc = LibraryLoader.create(LibC.class).load("c"); | ||
| runtime = Runtime.getRuntime(libc); | ||
| } catch (Throwable t) { | ||
| LOGGER.info("Could not load JNR LibC Library, native calls will not be available"); |
There was a problem hiding this comment.
it would be nice to include the throwable stack (maybe at DEBUG only though?) So it's easier to understand why it couldn't be loaded. In one case I had a classloader issue which wasn't obvious so it would be good to have a way to expose the exception.
|
The OSGI example suite was failing because jnr was missing, I created java727-osgi (commit 9e6e7b6) which may resolve this. However, the solution isn't that simple because some modules JNR depends on are not configured as OSGi modules. These are:
When building & installing the maven modules w/ those PRs (and removing SNAPSHOT from the versions), the OSGi tests pass. I think if we want OSGi work in the near term, we should make the JNR dependency optional and have the driver OSGi module not import their packages. If those PRs get merged and versions with them released, we could add JNR back. |
|
Thank you very much @tolbertam for catching the OSGi problem. After reviewing your branch another possible solution came to me: Actually JNR is de facto optional for the driver: if the dependency is missing, the driver falls back to SystemClock automatically. So my idea was to make it optional for OSGi bundles only. See here. The OSGi tests pass with this configuration, and we do see the warning: I also cherry-picked your changes to the OSGi application. |
|
Nice @adutra! That was not something I was aware was possible. Given this and that JNR cannot work in OSGi without the aforementioned PRs against jffi and jnr-x86asm, do we even need |
Right now, no; but my idea was that if one day those PRs are merged and OSGi bundles are available for all JNR libraries, then people could start including JNR in their OSGi runtimes and the driver bundle would simply detect that the package is present and activate it. |
Got it, that makes sense, thanks for clearing that up. I need to look into whether or not it will actually work with just |
|
Looks like it doesn't work, because it doesn't pull in jnr.ffi.provider into Import-Packages for some reason, but it does pull in others: |
|
This is what I needed to do to support JNR in OSGi while making it optional at the same time, seems to work on my branch: Shaded Imports Unfortunately all of those packages must be explicitly listed as wildcarding doesn't seem to work (it must have something to do with how things are loaded). |
That's unfortunate :( shouldn't we simply exclude JNR from the OSGi bundle right now ( |
Yeah, definitely. Have tried a number of permutations and that's the only one I can seem to make work both ways.
I'm 👍 on that idea if it works. Making it explicit that we don't import it should hopefully make it clear to users that we don't support using JNR with OSGi. |
|
It looks like !jnr.* works, so I'm 👍 on that. |
| long lastWarning = lastClockDriftWarning.get(); | ||
| if (now > lastWarning + CLOCK_DRIFT_WARNING_INTERVAL && lastClockDriftWarning.compareAndSet(lastWarning, now)) { | ||
| if (current == last) { | ||
| LOGGER.warn( |
There was a problem hiding this comment.
Have you noticed when running the OSGi tests that it occasionally logs:
[main] WARN com.datastax.driver.core.TimestampGenerator - Clock drift detected: same timestamp was generated twice (1457112131130000), returned timestamps will be artificially incremented to guarantee monotonicity. This message will only be logged once every second.
With a system clock this seems very likely since you can have concurrent access in the same millisecond. I'm wondering if we should consider removing this particular warning and depending on the 'Clock skew' one instead.
There was a problem hiding this comment.
That, or make lastClockDriftWarning static maybe?
There was a problem hiding this comment.
making it static does seem like a good idea. I think this will still happen though, because clock.currentTimeMicros() for SystemClock is only going to have millisecond resolution and the likelihood of multiple calls in an ms seems very high. The previous implementation only printed a warning if 999 entries were created for the same millisecond, so it was much less sensitive to this.
Done! |
|
The clock drift warning condition is too aggressive: with a simple load test I can reproduce "clock drift" almost every second, and even "clock skew" regularly. This is with the microsecond-resolution clock; with |
|
Pushed PoC for a "drift threshold" (some javadocs missing, will complete later). |
|
The threshold improves things, but I'm more and more thinking that issuing warnings might not be the best approach. We could also provide a new Metrics meter, e.g. a Timer that would record the N last abnormal deltas (i.e. deltas for which last - current <= 0, with maybe a threshold other than zero to filter out minor skews). This way users could query this metrics to find out if there were skews (timer.getCount() > 0) and what was the average delta for them (timer.getSnapshot().getMean()). |
|
Metrics complicate things though, because we'd have to link timestamp generators to the |
Sounds like a good compromise. |
|
New commit based on previous discussion. |
|
Pushed a small commit that reorganizes the documentation among the class hierarchy; hope that's ok otherwise I can revert it. |
|
Looks great to me now 👍 |
… + use microsecond precision when possible. This commit introduces a new dependency to JNR, and uses the system call 'gettimeofday' when available to generate timestamps with microsecond precision.
JAVA-727: Allow monotonic timestamp generators to drift in the future + use microsecond precision when possible.
|
But how to make the JNR optional for OSGi bundles only?? PLEASE HELP ME! |
No description provided.