Skip to content

JAVA-727: Allow monotonic timestamp generators to drift in the future + use microsecond precision when possible.#591

Merged
olim7t merged 1 commit into2.1from
java727
Mar 9, 2016
Merged

JAVA-727: Allow monotonic timestamp generators to drift in the future + use microsecond precision when possible.#591
olim7t merged 1 commit into2.1from
java727

Conversation

@adutra
Copy link
Copy Markdown
Contributor

@adutra adutra commented Feb 11, 2016

No description provided.

import static org.testng.Assert.fail;

public class AtomicMonotonicTimestampGeneratorTest {
public class AtomicMonotonicTimestampGeneratorTest extends CCMTestsSupport {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we keep it unit test only? It pains me that we have to spin up a Cassandra cluster to test such a simple component.

@olim7t
Copy link
Copy Markdown
Contributor

olim7t commented Feb 11, 2016

Open question: would there be some value in leaving the old behavior as an alternative? One issue with a drifting generator is that you could loose the real-time ordering between multiple clients if only one of them has drifted in the future. Although this was possible as well with the previous implementations... At least we should document it.

Speaking of documentation, we should also update the javadoc of the two implementations, and probably briefly mention the drifting behavior in the manual (manual/query_timestamps/).

* and adds the number of nanoseconds since the last call to get the current time, which is good
* enough accuracy for our purpose (see CASSANDRA-6106).
*/
class NativeClock implements Clock {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes from Sylvain Lebresne's suggestion in CASSANDRA-6106.

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Feb 11, 2016

Open question: would there be some value in leaving the old behavior as an alternative? One issue with a drifting generator is that you could loose the real-time ordering between multiple clients if only one of them has drifted in the future.

I very much doubt that two clients would be in sync enough to be able to order their statements so that the real last one wins. But if everyone agrees, I can leave the old behavior as an alternative.

Speaking of documentation, we should also update the javadoc of the two implementations, and probably briefly mention the drifting behavior in the manual (manual/query_timestamps/).

Yes good catch, I will add a note on that.

@adutra adutra added this to the 2.1.10 milestone Feb 12, 2016
* Base implementation for generators based on {@link System#currentTimeMillis()} and a counter to generate
* the sub-millisecond part.
*/
abstract class AbstractMonotonicTimestampGenerator implements TimestampGenerator {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the filename which wasn't in sync with the class name.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Feb 15, 2016

Switched to JNR call to gettimeofday as it has at least 2 advantages:

  • It is 10x faster than a JNA call to clock_gettime
  • It is available under more systems (including Mac OS - not tested under Windows)

The call costs avg. 200-300 ns in my benchmarks where a call to System.currentTimeMillis is 30-50 ns, so it is still more expensive. To compensate, I kept the updating system initially designed by Sylvain in CASSANDRA-6106. That mechanism brings the cost of a call to Native.currentTimeMicros to 30-50 ns, that is, equivalent to System.currentTimeMillis.

Complete benchmark results under Linux:

Benchmark                     Mode  Cnt     Score     Error  Units
ClockBenchmark.testCurMillis  avgt   20    48.024 ±   1.805  ns/op
ClockBenchmark.testNanoTime   avgt   20    38.167 ±   2.352  ns/op
ClockBenchmark.testJNA        avgt   20  4028.256 ± 166.911  ns/op
ClockBenchmark.testJNAApprox  avgt   20    37.290 ±   0.979  ns/op
ClockBenchmark.testJNR        avgt   20   279.229 ±  16.476  ns/op
ClockBenchmark.testJNRApprox  avgt   20    38.222 ±   2.153  ns/op

Note: The "approx" benchmarks use Sylvain's method.

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Feb 15, 2016

Squashed and rebased

* <p/>
* On Linux systems, however, it is possible to have a better granularity by using a JNA
* call to {@code clock_gettime}. To benefit from this system call, set the system
* property {@code com.datastax.driver.USE_NATIVE_CLOCK} to {@code true}.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's now set to true by default.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, will amend, thanks.

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Feb 15, 2016

All remarks addressed, thanks for your feedback.

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Feb 15, 2016

Squashed again in preparation for JAVA-444, sorry for the inconvenience.

@olim7t
Copy link
Copy Markdown
Contributor

olim7t commented Feb 23, 2016

Rebased on current 2.1 (2708bd7)

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Mar 1, 2016

Squashed and rebased on top of current 2.1.

@olim7t
Copy link
Copy Markdown
Contributor

olim7t commented Mar 1, 2016

Rebased again, with a few minor changes (mainly doc).

* @return the next timestamp to use
*/
protected long computeNext(long last) {
long current = clock.currentTimeMicros();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could there be value in adding a warning statement if we detect current being 1000ms less than last? That could indicate a clock drift, which might be worth logging (although we wouldn't want to spam it(.

libc = LibraryLoader.create(LibC.class).load("c");
runtime = Runtime.getRuntime(libc);
} catch (Throwable t) {
LOGGER.info("Could not load JNR LibC Library, native calls will not be available");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be nice to include the throwable stack (maybe at DEBUG only though?) So it's easier to understand why it couldn't be loaded. In one case I had a classloader issue which wasn't obvious so it would be good to have a way to expose the exception.

@tolbertam
Copy link
Copy Markdown
Contributor

The OSGI example suite was failing because jnr was missing, I created java727-osgi (commit 9e6e7b6) which may resolve this. However, the solution isn't that simple because some modules JNR depends on are not configured as OSGi modules. These are:

When building & installing the maven modules w/ those PRs (and removing SNAPSHOT from the versions), the OSGi tests pass.

I think if we want OSGi work in the near term, we should make the JNR dependency optional and have the driver OSGi module not import their packages. If those PRs get merged and versions with them released, we could add JNR back.

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Mar 4, 2016

Thank you very much @tolbertam for catching the OSGi problem. After reviewing your branch another possible solution came to me:

Actually JNR is de facto optional for the driver: if the dependency is missing, the driver falls back to SystemClock automatically. So my idea was to make it optional for OSGi bundles only. See here.

The OSGi tests pass with this configuration, and we do see the warning:

[FelixStartLevel] INFO com.datastax.driver.core.Native - Could not load JNR LibC Library, native calls will not be available (set this logger level to DEBUG to see the full stack trace)

I also cherry-picked your changes to the OSGi application.

@tolbertam
Copy link
Copy Markdown
Contributor

Nice @adutra! That was not something I was aware was possible. Given this and that JNR cannot work in OSGi without the aforementioned PRs against jffi and jnr-x86asm, do we even need jnr.* in Import-Packages?

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Mar 4, 2016

we even need jnr.* in Import-Packages?

Right now, no; but my idea was that if one day those PRs are merged and OSGi bundles are available for all JNR libraries, then people could start including JNR in their OSGi runtimes and the driver bundle would simply detect that the package is present and activate it.

@tolbertam
Copy link
Copy Markdown
Contributor

Right now, no; but my idea was that if one day those PRs are merged and OSGi bundles are available for all JNR libraries, then people could start including JNR in their OSGi runtimes and the driver bundle would simply detect that the package is present and activate it.

Got it, that makes sense, thanks for clearing that up. I need to look into whether or not it will actually work with just jnr.*. I messed around quite a bit with my environment last night, so it's possible it does work. Will try it with the OSGified modules to see if it works.

@tolbertam
Copy link
Copy Markdown
Contributor

Looks like it doesn't work, because it doesn't pull in jnr.ffi.provider into Import-Packages for some reason, but it does pull in others:

Manifest-Version: 1.0
Bnd-LastModified: 1457105298481
Build-Jdk: 1.8.0_60
Built-By: atolbert
Bundle-Description: A driver for Apache Cassandra 1.2+ that works exclus
 ively with the Cassandra Query Language version 3 (CQL3) and Cassandra'
 s binary protocol.
Bundle-License: http://www.apache.org/licenses/LICENSE-2.0.txt
Bundle-ManifestVersion: 2
Bundle-Name: DataStax Java Driver for Apache Cassandra - Core
Bundle-SymbolicName: com.datastax.driver.core
Bundle-Version: 2.1.10.SNAPSHOT
Created-By: Apache Maven Bundle Plugin
Export-Package: com.datastax.driver.core;uses:="com.codahale.metrics,com
 .datastax.driver.core.exceptions,com.datastax.driver.core.policies,com.
 google.common.reflect,com.google.common.util.concurrent,io.netty.bootst
 rap,io.netty.buffer,io.netty.channel,io.netty.channel.socket,io.netty.h
 andler.codec,io.netty.util,javax.net.ssl,jnr.ffi,jnr.ffi.annotations,or                                                                                                              
 g.slf4j";version="2.1.10.SNAPSHOT",com.datastax.driver.core.exceptions;
 uses:="com.datastax.driver.core";version="2.1.10.SNAPSHOT",com.datastax
 .driver.core.policies;uses:="com.datastax.driver.core,com.google.common
 .base";version="2.1.10.SNAPSHOT",com.datastax.driver.core.querybuilder;
 uses:="com.datastax.driver.core,com.datastax.driver.core.policies";vers
 ion="2.1.10.SNAPSHOT",com.datastax.driver.core.schemabuilder;uses:="com
 .datastax.driver.core,com.google.common.base";version="2.1.10.SNAPSHOT"
 ,com.datastax.driver.core.utils;uses:="com.google.common.util.concurren
 t";version="2.1.10.SNAPSHOT"
Import-Package: com.google.common.base;version="[14.0,19)",com.google.co
 mmon.collect;version="[14.0,19)",com.google.common.primitives;version="
 [14.0,19)",com.google.common.reflect;version="[14.0,19)",com.google.com
 mon.util.concurrent;version="[14.0,19)",jnr.ffi;resolution:=optional;ve
 rsion="[2.0,3)",jnr.ffi.annotations;resolution:=optional;version="[2.0,
 3)",com.codahale.metrics;version="[3.0,4)",io.netty.bootstrap;version="
 [4.0,5)",io.netty.buffer;version="[4.0,5)",io.netty.channel;version="[4
 .0,5)",io.netty.channel.group;version="[4.0,5)",io.netty.channel.nio;ve
 rsion="[4.0,5)",io.netty.channel.socket;version="[4.0,5)",io.netty.chan
 nel.socket.nio;version="[4.0,5)",io.netty.handler.codec;version="[4.0,5
 )",io.netty.handler.ssl;version="[4.0,5)",io.netty.handler.timeout;vers
 ion="[4.0,5)",io.netty.util;version="[4.0,5)";resolution:=optional,io.n
 etty.util.concurrent;version="[4.0,5)";resolution:=optional,javax.namin
 g,javax.naming.directory,javax.net.ssl,net.jpountz.lz4;resolution:=opti
 onal,org.HdrHistogram;resolution:=optional,org.slf4j;version="[1.7,2)",
 org.xerial.snappy;version="[1.0,2)";resolution:=optional
Tool: Bnd-2.1.0.20130426-122213
[FelixStartLevel] INFO com.datastax.driver.core.Native - Could not load JNR LibC Library, native calls will not be available
java.lang.IllegalArgumentException: interface jnr.ffi.provider.LoadedLibrary is not visible from class loader
    at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:581)
    at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:557)
    at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:230)
    at java.lang.reflect.WeakCache.get(WeakCache.java:127)
    at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:419)
    at java.lang.reflect.Proxy.newProxyInstance(Proxy.java:719)
    at jnr.ffi.LibraryLoader.createErrorProxy(LibraryLoader.java:306)
    at jnr.ffi.LibraryLoader.load(LibraryLoader.java:301)
    at jnr.ffi.LibraryLoader.load(LibraryLoader.java:269)
    at com.datastax.driver.core.Native.<clinit>(Native.java:75)
    at com.datastax.driver.core.ClockFactory.newInstance(Clock.java:51)
    at com.datastax.driver.core.AbstractMonotonicTimestampGenerator.<init>(AbstractMonotonicTimestampGenerator.java:37)
    at com.datastax.driver.core.AtomicMonotonicTimestampGenerator.<init>(AtomicMonotonicTimestampGenerator.java:42)
    at com.datastax.driver.osgi.impl.Activator.start(Activator.java:44)
    at org.apache.felix.framework.util.SecureAction.startActivator(SecureAction.java:697)
    at org.apache.felix.framework.Felix.activateBundle(Felix.java:2223)
    at org.apache.felix.framework.Felix.startBundle(Felix.java:2141)
    at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1368)
    at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:308)
    at java.lang.Thread.run(Thread.java:745)

@tolbertam
Copy link
Copy Markdown
Contributor

This is what I needed to do to support JNR in OSGi while making it optional at the same time, seems to work on my branch:

<![CDATA[com.google.common*;version="[14.0,19)",jnr.*;resolution:=optional,jnr.ffi.provider;resolution:=optional,jnr.ffi.provider.jffi;resolution:=optional,jnr.ffi.provider.converters;resolution:=optional,com.kenai.jffi;resolution:=optional,*]]>

Shaded Imports

<![CDATA[com.google.common.*;version="[14.0,19)",jnr.*;resolution:=optional,jnr.ffi.provider;resolution:=optional,jnr.ffi.provider.jffi;resolution:=optional,jnr.ffi.provider.converters;resolution:=optional,com.kenai.jffi;resolution:=optional,!io.netty.*,javax.security.cert,*]]>

Unfortunately all of those packages must be explicitly listed as wildcarding doesn't seem to work (it must have something to do with how things are loaded).

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Mar 4, 2016

Unfortunately all of those packages must be explicitly listed as wildcarding doesn't seem to work (it must have something to do with how things are loaded).

That's unfortunate :( shouldn't we simply exclude JNR from the OSGi bundle right now (!jnr.*) and if/when everything is available as bundles, change it later?

@tolbertam
Copy link
Copy Markdown
Contributor

That's unfortunate :(

Yeah, definitely. Have tried a number of permutations and that's the only one I can seem to make work both ways.

shouldn't we simply exclude JNR from the OSGi bundle right now (!jnr.*) and if/when everything is available as bundles, change it later?

I'm 👍 on that idea if it works. Making it explicit that we don't import it should hopefully make it clear to users that we don't support using JNR with OSGi.

@tolbertam
Copy link
Copy Markdown
Contributor

It looks like !jnr.* works, so I'm 👍 on that.

long lastWarning = lastClockDriftWarning.get();
if (now > lastWarning + CLOCK_DRIFT_WARNING_INTERVAL && lastClockDriftWarning.compareAndSet(lastWarning, now)) {
if (current == last) {
LOGGER.warn(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you noticed when running the OSGi tests that it occasionally logs:

[main] WARN com.datastax.driver.core.TimestampGenerator - Clock drift detected: same timestamp was generated twice (1457112131130000), returned timestamps will be artificially incremented to guarantee monotonicity. This message will only be logged once every second.

With a system clock this seems very likely since you can have concurrent access in the same millisecond. I'm wondering if we should consider removing this particular warning and depending on the 'Clock skew' one instead.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That, or make lastClockDriftWarning static maybe?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making it static does seem like a good idea. I think this will still happen though, because clock.currentTimeMicros() for SystemClock is only going to have millisecond resolution and the likelihood of multiple calls in an ms seems very high. The previous implementation only printed a warning if 999 entries were created for the same millisecond, so it was much less sensitive to this.

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Mar 4, 2016

It looks like !jnr.* works, so I'm 👍 on that.

Done!

@olim7t
Copy link
Copy Markdown
Contributor

olim7t commented Mar 7, 2016

The clock drift warning condition is too aggressive: with a simple load test I can reproduce "clock drift" almost every second, and even "clock skew" regularly. This is with the microsecond-resolution clock; with System.currentTimeMillis(), the warning appears as soon as two timestamps are generated in the same millisecond.
Granted, my example is contrived, but if I can reproduce it on my machine it's not unlikely that it could happen in a high-throughput environment.
I think a better test would be last > current + delta, with delta provided as a constructor parameter, with a sensible default (1 second comes to mind).

@olim7t
Copy link
Copy Markdown
Contributor

olim7t commented Mar 7, 2016

Pushed PoC for a "drift threshold" (some javadocs missing, will complete later).

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Mar 8, 2016

The threshold improves things, but I'm more and more thinking that issuing warnings might not be the best approach.

We could also provide a new Metrics meter, e.g. a Timer that would record the N last abnormal deltas (i.e. deltas for which last - current <= 0, with maybe a threshold other than zero to filter out minor skews). This way users could query this metrics to find out if there were skews (timer.getCount() > 0) and what was the average delta for them (timer.getSnapshot().getMean()).

@olim7t
Copy link
Copy Markdown
Contributor

olim7t commented Mar 8, 2016

Metrics complicate things though, because we'd have to link timestamp generators to the Cluster instance in some way (right now they are standalone components). We shouldn't overthink this too much in advance. I think a good compromise would be to make maybeLogWarning protected so that it can be overridden, and give it a more meaningful name like onClockDrift.
What I like about the configurable threshold is that you can completely disable warnings by setting it to a negative value.

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Mar 8, 2016

I think a good compromise would be to make maybeLogWarning protected so that it can be overridden, and give it a more meaningful name like onClockDrift.

Sounds like a good compromise.

@olim7t
Copy link
Copy Markdown
Contributor

olim7t commented Mar 8, 2016

New commit based on previous discussion.

@adutra
Copy link
Copy Markdown
Contributor Author

adutra commented Mar 9, 2016

Pushed a small commit that reorganizes the documentation among the class hierarchy; hope that's ok otherwise I can revert it.

@tolbertam
Copy link
Copy Markdown
Contributor

Looks great to me now 👍

@olim7t olim7t changed the title JAVA-727: Monotonic Timestamp generators should not provide the same timestamp more than once.. JAVA-727: Allow monotonic timestamp generators to drift in the future + use microsecond precision when possible. Mar 9, 2016
… + use microsecond precision when possible.

This commit introduces a new dependency to JNR, and uses the system call
'gettimeofday' when available to generate timestamps with microsecond
precision.
olim7t added a commit that referenced this pull request Mar 9, 2016
JAVA-727: Allow monotonic timestamp generators to drift in the future + use microsecond precision when possible.
@olim7t olim7t merged commit fc8ef94 into 2.1 Mar 9, 2016
@olim7t olim7t deleted the java727 branch March 9, 2016 21:07
@barhouum7
Copy link
Copy Markdown

barhouum7 commented Dec 7, 2021

But how to make the JNR optional for OSGi bundles only?? PLEASE HELP ME!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants