
Fix to get the correct value for Environment.TickCount in OSX #1269

Merged
5 commits merged into dotnet:master on Jul 28, 2015

Conversation

Sridhar-MS

On Linux the monotonic clock API clock_gettime is available, but on OSX the PAL implementation falls back to gettimeofday, which always returned the time elapsed since the Unix epoch (1970/01/01). I fixed it by using the mach API clock_get_time on OSX to get the time elapsed since system start.

Fixes #2441
https://github.com/dotnet/corefx/issues/2441

CC: @stephentoub
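A minimal sketch of the idea, for illustration only (the helper name is made up; the return-value checks and host-port cleanup raised in the review below are left out here):

#include <mach/mach.h>
#include <mach/clock.h>
#include <stdint.h>

// SYSTEM_CLOCK counts from system start, so converting its mach_timespec_t
// to milliseconds gives a TickCount-style value (unlike gettimeofday, which
// counts from the Unix epoch).
static uint64_t TickCountFromSystemClock()
{
    clock_serv_t clock;
    mach_timespec_t mts;

    host_get_clock_service(mach_host_self(), SYSTEM_CLOCK, &clock);
    clock_get_time(clock, &mts);
    mach_port_deallocate(mach_task_self(), clock);

    return (uint64_t)mts.tv_sec * 1000 + (uint64_t)mts.tv_nsec / 1000000;
}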

@dnfclas

dnfclas commented Jul 22, 2015

Hi @Sridhar-MS, I'm your friendly neighborhood .NET Foundation Pull Request Bot (You can call me DNFBOT). Thanks for your contribution!

In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement. It's all electronic and will take just minutes. I promise there's no faxing. https://cla2.dotnetfoundation.org.

TTYL, DNFBOT;

@dnfclas

dnfclas commented Jul 22, 2015

@Sridhar-MS, Thanks for signing the contribution license agreement so quickly! Actual humans will now validate the agreement and then evaluate the PR.

Thanks, DNFBOT;

clock_serv_t clock;
mach_timespec_t mts;

host_get_clock_service(mach_host_self(), SYSTEM_CLOCK, &clock);


This leaks the host port, no? Also, why are you not checking the return value here?

Author


Yes, I missed checking the return value here. I am deallocating the host port using 'mach_port_deallocate' a few lines below, but I need to make sure the host port gets deallocated even when the return values indicate failure. I am working on the fix.

- Add return value check for "host_get_clock_service".
- Fix host port leaks in GetTickCount64 and GlobalMemoryStatusEx.
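A sketch of the corrected pattern those two commit notes describe (illustrative, not the literal diff; the function name is made up):

#include <mach/mach.h>
#include <mach/clock.h>
#include <stdint.h>

// Check host_get_clock_service's return value and deallocate both the clock
// service port and the host port on every path, including failure paths.
static uint64_t TickCountChecked()
{
    kern_return_t machRet;
    clock_serv_t clock;
    mach_timespec_t mts;
    uint64_t retval = 0;

    host_t host = mach_host_self();

    if ((machRet = host_get_clock_service(host, SYSTEM_CLOCK, &clock)) == KERN_SUCCESS)
    {
        if (clock_get_time(clock, &mts) == KERN_SUCCESS)
        {
            retval = (uint64_t)mts.tv_sec * 1000 + (uint64_t)mts.tv_nsec / 1000000;
        }
        mach_port_deallocate(mach_task_self(), clock);
    }

    // The host port is released even when host_get_clock_service fails.
    mach_port_deallocate(mach_task_self(), host);
    return retval;
}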
@Sridhar-MS
Author

I did some reference-counting experiments on host ports and discovered that there is one more place in the PAL layer (src/pal/src/misc/sysinfo.cpp) where the host port is leaking. I have fixed it too.

mach_timespec_t mts;
host_t host = mach_host_self();

if((machRet = host_get_clock_service(host, SYSTEM_CLOCK, &clock)) != KERN_SUCCESS)
Member


GetTickCount is expected to be a very fast API. Acquiring/releasing the clock service on each call looks pretty expensive. Could you please do some quick performance measurements, and maybe consider caching the clock service?
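One possible shape for that caching, as a rough sketch (the names and the use of pthread_once are assumptions for this example, not the PAL's actual initialization path):

#include <mach/mach.h>
#include <mach/clock.h>
#include <pthread.h>
#include <stdint.h>

// Acquire the SYSTEM_CLOCK service once and reuse it, so each call only pays
// for clock_get_time.
static clock_serv_t g_systemClock;
static pthread_once_t g_clockOnce = PTHREAD_ONCE_INIT;

static void InitSystemClock()
{
    host_t host = mach_host_self();
    host_get_clock_service(host, SYSTEM_CLOCK, &g_systemClock);
    mach_port_deallocate(mach_task_self(), host);
}

static uint64_t TickCountCached()
{
    mach_timespec_t mts;
    pthread_once(&g_clockOnce, InitSystemClock);
    if (clock_get_time(g_systemClock, &mts) != KERN_SUCCESS)
        return 0;
    return (uint64_t)mts.tv_sec * 1000 + (uint64_t)mts.tv_nsec / 1000000;
}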

Author


@jkotas I did some crude performance measurements around Environment.TickCount.
With my changes (no caching) - Average time 3.0 microseconds
Without my changes (using gettimeofday) - Average time 0.13 microseconds.
I am not sure if the perf cost is within an acceptable level of tolerance. Meanwhile I am also trying to get perf measurements with the clock service cached.

I did my testing on release builds and got the average from 10 million iterations.

        const long tccSecondsToMicroSeconds = 1000000;
        [Fact]
        public static void TickCountPerf()
        {            
            int iterations = 10000000;
            long totalTime = 0;
            for (int i = 0; i < iterations; i++)
            {
                long startTime = GetTimeofDay();
                int tick = Environment.TickCount;
                totalTime += GetTimeofDay() - startTime;
            }

            float average = (float)totalTime / iterations;
            Console.WriteLine("totalTime - {0} usecs", totalTime);
            Console.WriteLine("Average - {0} usecs", average);
        }

        public static long GetTimeofDay()
        {
            timeval tv = new timeval();           
            int result = gettimeofday(ref tv, IntPtr.Zero);            
            return (tv.tv_sec * tccSecondsToMicroSeconds) + tv.tv_usec;
        }

        [DllImport("libc")]
        // provides microsecond level resolution
        private static extern int gettimeofday(ref timeval tv, IntPtr tz);

Author


@jkotas I also got the results with caching the clock service.
With my changes (no caching) - Average time 3.0 microseconds.
With my changes (with caching) - Average time 0.9 microseconds.
Without my changes (using gettimeofday) - Average time 0.13 microseconds.

There is a minor increase in perf cost even with caching enabled.

Member


The microbenchmark is not right. The GetTimeofDay() method should not be called on every iteration, because it is about as expensive as the Environment.TickCount read you are trying to measure. It should be called only once before and once after the loop:

        public static void TickCountPerf()
        {            
            int iterations = 10000000;
            long startTime = GetTimeofDay();
            for (int i = 0; i < iterations; i++)
            {
                int tick = Environment.TickCount;
            }
            long totalTime = GetTimeofDay() - startTime;
            ...
        }

Once you fix the microbenchmark, I expect that you will get numbers ~0.03 microseconds without your change, and ~3 microseconds with your change. Making Environment.TickCount 100x slower is certainly not within acceptable level of tolerance.

For reference, Environment.TickCount takes less than 0.01 microseconds on Windows.

Author


Thanks for the feedback @jkotas! I updated the benchmark and got results close to the numbers you predicted.

With my changes (no caching) - Average time 2.86 microseconds.
With my changes (with caching) - Average time 0.82 microseconds.
Without my changes (using gettimeofday) - Average time 0.05 microseconds.

So with caching enabled it is around 16.4x slower.

Here is the updated microbenchmark:

        const long tccSecondsToMicroSeconds = 1000000;

        [Fact]
        public static void TickCountPerf()
        {            
            int iterations = 10000000;            
            long startTime = GetTimeofDay();

            for (int i = 0; i < iterations; i++)
            {               
                int tick = Environment.TickCount;

            }

            long totalTime = GetTimeofDay() - startTime;

            float average = (float)totalTime / iterations;
            Console.WriteLine("totalTime -- {0} usecs", totalTime);
            Console.WriteLine("Average -- {0} usecs", average);
        }

        public static long GetTimeofDay()
        {
            timeval tv = new timeval();           
            int result = gettimeofday(ref tv, IntPtr.Zero);            
            return (tv.tv_sec * tccSecondsToMicroSeconds) + tv.tv_usec;
        }

        [DllImport("libc")]
        private static extern int gettimeofday(ref timeval tv, IntPtr tz);
    }

    struct timeval
    {
        public long tv_sec; /* seconds since Jan. 1, 1970 */
        public long tv_usec; /* and microseconds */
    }

Member


caching enabled it is around 16.4x slower

It is better than before, but still ~100x slower compared to Windows...

I have looked around for what is available ... have you considered using mach_absolute_time()? It seems to be the fastest way to get what you want.

Also, I am wondering whether we need to make a similar change in other places like https://github.com/dotnet/coreclr/blob/master/src/pal/src/synchmgr/synchmanager.cpp#L4881 to be robust against clock time adjustments (e.g. the daylight saving time switch). This can be tracked as a separate issue.
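For reference, a rough sketch of what the mach_absolute_time route looks like (illustrative only; mach_timebase_info supplies the numer/denom pair that converts its tick count to nanoseconds, and the function name here is made up):

#include <mach/mach_time.h>
#include <stdint.h>

// Convert mach_absolute_time's tick count into milliseconds since boot.
static uint64_t TickCountFromMachAbsoluteTime()
{
    static mach_timebase_info_data_t s_timebase;  // cached; numer/denom maps ticks to nanoseconds
    if (s_timebase.denom == 0)
    {
        mach_timebase_info(&s_timebase);
    }

    uint64_t nanoseconds = mach_absolute_time() * s_timebase.numer / s_timebase.denom;
    return nanoseconds / 1000000;  // nanoseconds -> milliseconds
}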

Author


I am also looking into mach_absolute_time(). I did some initial perf tests and got the average time down to 0.04 microseconds, versus 0.05 microseconds without my changes. I will keep you updated on that.


mach_absolute_time uses rdtsc

…ach_absolute_time' in GetTickCount64.

The monotonic clock function 'clock_get_time' was causing a perf regression in 'Environment.TickCount', so it is replaced with 'mach_absolute_time', which gives considerable performance gains. Here are the perf numbers:
Using 'clock_get_time' (without caching host ports) - Average time 2.86 microseconds.
Using 'clock_get_time' (with caching host ports) - Average time 0.82 microseconds.
Using 'gettimeofday' (original implementation) - Average time 0.05 microseconds.
Using 'mach_absolute_time' - Average time 0.04 microseconds.
@Sridhar-MS
Author

@jkotas Here are the final perf numbers
Using 'clock_get_time' (without caching host ports) - Average time 2.86 microseconds.
Using 'clock_get_time' (with caching host ports) - Average time 0.82 microseconds.
Using 'gettimeofday' (original implementation) - Average time 0.05 microseconds.
Using 'mach_absolute_time' - Average time 0.04 microseconds.

I was initially not sure if 'mach_absolute_time' was monotonic, so I went ahead with 'clock_get_time'. Now I have verified that both 'clock_get_time' and 'mach_absolute_time' return the exact same elapsed time. With that said, I replaced the implementation with 'mach_absolute_time' because of its better performance numbers.

@@ -298,4 +320,4 @@ GetTickCount64()
#endif // HAVE_CLOCK_MONOTONIC
EXIT:
return retval;
}
}
Member


Could you please add new line at the end of the file?

…nd resolution support for StopWatch in OSX.

OSX provides nanosecond timer capabilities, but the PAL was not using them and was falling back to microsecond-level resolution.
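As a sketch of what that could look like for the PAL's QueryPerformanceCounter/QueryPerformanceFrequency pair backing Stopwatch (illustrative only; the helper names are made up):

#include <mach/mach_time.h>
#include <stdint.h>

// Report mach_absolute_time ticks as the performance counter and derive the
// counter frequency (ticks per second) from the timebase, giving Stopwatch
// nanosecond-level resolution instead of gettimeofday's microseconds.
static int64_t QueryCounterValue()
{
    return (int64_t)mach_absolute_time();
}

static int64_t QueryCounterFrequency()
{
    mach_timebase_info_data_t timebase;
    mach_timebase_info(&timebase);
    // One tick lasts (numer / denom) nanoseconds, so there are
    // 1e9 * denom / numer ticks per second.
    return (int64_t)(1000000000ULL * timebase.denom / timebase.numer);
}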
SET_DEFAULT_DEBUG_CHANNEL(MISC);

static BOOL s_TimeInitialized = FALSE;
Member


This variable should not be needed.

Author


I am new to the PAL source code, so I am trying to follow existing conventions. Example - https://github.com/dotnet/coreclr/blob/master/src/pal/src/exception/machexception.cpp#L74.

So I believe that if you are not freeing any resources in cleanup, you don't need an 'init' variable.

@jkotas
Member

jkotas commented Jul 25, 2015

LGTM modulo comments

@kangaroo

I mentioned this in a now-outdated comment, but mach_absolute_time uses rdtsc. My understanding is that TickCount is supposed to be monotonic?

@Sridhar-MS
Author

@kangaroo I had the same concern initially. But in the code we are rounding the nanosecond measurement down to milliseconds. This keeps the TickCount value non-decreasing, i.e. any small jump back and forth by mach_absolute_time is absorbed by the rounding.

As an experiment, I ran both mach_absolute_time and the monotonic function clock_get_time in a loop 10 million times, rounding both values to milliseconds. The value returned by mach_absolute_time was always non-decreasing and the same as the value returned by clock_get_time.
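A rough sketch of that experiment as a standalone program (my reconstruction, not the original test; the one-millisecond tolerance on the comparison is my addition to allow for rounding at tick boundaries):

#include <mach/mach.h>
#include <mach/clock.h>
#include <mach/mach_time.h>
#include <stdint.h>
#include <stdio.h>

// Compare mach_absolute_time and clock_get_time, both rounded to milliseconds,
// and check that the mach_absolute_time value never decreases.
int main()
{
    mach_timebase_info_data_t timebase;
    mach_timebase_info(&timebase);

    clock_serv_t clock;
    host_get_clock_service(mach_host_self(), SYSTEM_CLOCK, &clock);

    uint64_t previous = 0;
    for (int i = 0; i < 10000000; i++)
    {
        uint64_t machMs = mach_absolute_time() * timebase.numer / timebase.denom / 1000000;

        mach_timespec_t mts;
        clock_get_time(clock, &mts);
        uint64_t clockMs = (uint64_t)mts.tv_sec * 1000 + (uint64_t)mts.tv_nsec / 1000000;

        if (machMs < previous)
            printf("mach_absolute_time went backwards at iteration %d\n", i);
        if (machMs > clockMs + 1 || clockMs > machMs + 1)
            printf("values diverged: %llu vs %llu\n", machMs, clockMs);
        previous = machMs;
    }

    mach_port_deallocate(mach_task_self(), clock);
    return 0;
}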

@kangaroo

@Sridhar-MS Did you force core migrations, CPU frequency scaling, and S3 power switching?

@Sridhar-MS
Author

@jkotas Made fixes based on your feedback.
@kangaroo Thanks for the feedback. I have filed issue #1300 to track it.

Sridhar-MS added a commit that referenced this pull request Jul 28, 2015
Fix to get the correct value for Environment.TickCount in OSX
@Sridhar-MS Sridhar-MS merged commit 9553016 into dotnet:master Jul 28, 2015

Successfully merging this pull request may close these issues.

Tests fail with System.BadImageFormatException
5 participants