Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

setOneSchema in SchemaCache is never triggered, many getOneSchema results in misses #4530

Closed
tolgaatam opened this issue Jan 30, 2018 · 10 comments · Fixed by #5612
Closed

setOneSchema in SchemaCache is never triggered, many getOneSchema results in misses #4530

tolgaatam opened this issue Jan 30, 2018 · 10 comments · Fixed by #5612
Assignees
Labels
type:feature New feature or improvement of existing feature

Comments

@tolgaatam
Copy link

tolgaatam commented Jan 30, 2018

Issue Description

In SchemaCache, there is the function setOneSchema for setting individual schemas of classes in the cache. this function is never utilized and there is never individual schemas accumulating in the cache.
however, getOneSchema is being used and it always results in one miss before the MAIN_SCHEMA is looked up for that class, because the schema for that class would never have been set before.

Steps to reproduce

  • set enableSingleSchemaCache to true (same could happen without this option set to true, but my setup is like this)
  • use Redis adapter for easier investigation, or console.log()'s for the LRU cache might work as well. if used redis, use redis-cli with monitor option to check real-time logs
  • send some requests from the client quickly (because there are ttl's for keys :) ) to make schema caching and lookup work, and produce some logs.
  • check redis-cli or default console.log()'s for caching logs.

Expected Results

After a lookup for one class misses, schema info related to that class would be inserted to the cache so that consecutive lookups won't miss.

Actual Outcome

no inserts regarding single class schemas are done, however lookups are still being done, resulting in many misses and useless round trips.

Environment Setup

  • Server
    • parse-server version (Be specific! Don't say 'latest'.) : 2.6.5
    • Operating System: Amazon Linux
    • Hardware: t2.nano (512 MB RAM, 1 CPU, EBS Storage)
    • Localhost or remote server? : AWS

Logs/Trace

Following is some logs from redis-cli, I traced these logs for minutes and no sets (psetex) are done for individual schemas

1517286114.528181 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.529996 [] "get" "xxbackend:__SCHEMAPhoto"
1517286114.530845 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.790274 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.791514 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.794785 [] "get" "xxbackend:__SCHEMAPost"
1517286114.795592 [] "get" "xxbackend:__SCHEMALike"
1517286114.796424 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.797531 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.801867 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.803162 [] "get" "xxbackend:__SCHEMAPost"
1517286114.804042 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.806571 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.808008 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.809425 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.811327 [] "get" "xxbackend:__SCHEMAPost"
1517286114.812166 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.817408 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.818600 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.819734 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.820944 [] "get" "xxbackend:__SCHEMAPost"
1517286114.821744 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.822866 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.824021 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.825780 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.826933 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.828031 [] "get" "xxbackend:__SCHEMAView"
1517286114.828851 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.829946 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"
1517286114.831018 [] "get" "xxbackend:__SCHEMAPost"
1517286114.831819 [] "get" "xxbackend:__SCHEMA__MAIN_SCHEMA"

Suggested solution

Either SchemaCache should not do lookups for individual classes and directly request main schema, or after missed lookups of individual class schemas, inserts should be done.

@tolgaatam
Copy link
Author

tolgaatam commented Feb 15, 2018

Considering that this topic is about a real issue (not a question as many topics would be about) and clear indications of the bug is already presented, I do not see how nobody chose to comment on it or at least label it as a suspected bug or something.

@flovilmart
Copy link
Contributor

@tolgaatam do you have a suggestion for the resolution? Are you able to write a failing test in order ot make this easier to investigate? we run all the tests with the redis cache so that would be interesting to add it to our test harness.

@woodardj
Copy link

Not sure if this is related, and we're a bit behind on parse-server version (2.6.2), but I'm also seeing scads of calls to get __SCHEMA__MAIN_SCHEMA when using the redis CacheAdapter, as many as 17 or 23 in some cases, causing some pretty significant performance issues at times. Here's a NewRelic stacktrace with timings:

screen shot 2018-02-22 at 2 51 04 pm

If you think this is unrelated, I'm happy to log separately, but the initial report feels pretty similar to me.

@woodardj
Copy link

I have another trace in NewRelic during a PUT operation on _Installation which calls "Redis get" at least 6 times (likely closer to 20 times) on the __SCHEMA__MAIN_SCHEMA key. This is rapidly becoming our performance bottleneck.

@flovilmart
Copy link
Contributor

There's definitely room for improvement in here that would benefit everyone. The schema is read multiple times, and perhaps it would be better to refactor the code so the schema is less validated.
Is it something you'd be willing to improve?

@woodardj
Copy link

I'm currently digging through Parse Server's internals to try and figure out what's going on in these cases; unfortunately my priority has to be solving the acute issue for my client. If that involves addressing this library issue, I'll see if there's something I can issue a PR for, I'll do it, otherwise this was tagged 'needs-investigation', so in the meantime I'm adding details as I can.

@tolgaatam
Copy link
Author

sorry for not getting in touch with you through this issue. no, i am not able to contribute much to the topic other than helping with the addressing of the issue :/

@tolgaatam
Copy link
Author

tolgaatam commented Mar 20, 2018

the original issue that i address in the post stems from this piece of code in parse-server/src/Controllers/SchemaCache.js beginning from line 52

getOneSchema(className) {
    if (!this.ttl) {
      return Promise.resolve(null);
    }
    return this.cache.get(this.prefix + className).then((schema) => {
      if (schema) {
        return Promise.resolve(schema);
      }
      return this.cache.get(this.prefix + MAIN_SCHEMA).then((cachedSchemas) => {
        cachedSchemas = cachedSchemas || [];
        schema = cachedSchemas.find((cachedSchema) => {
          return cachedSchema.className === className;
        });
        if (schema) {
          return Promise.resolve(schema);
        }
        return Promise.resolve(null);
      });
    });
  }

issue can be resolved like this:

getOneSchema(className) {
    if (!this.ttl) {
      return Promise.resolve(null);
    }
    return this.cache.get(this.prefix + MAIN_SCHEMA).then((cachedSchemas) => {
      cachedSchemas = cachedSchemas || [];
      var schema = cachedSchemas.find((cachedSchema) => {
        return cachedSchema.className === className;
      });
      if (schema) {
        return Promise.resolve(schema);
      }
      return Promise.resolve(null);
    });
  }

so that individual class schema caches are never checked but MAIN_SCHEMA is pulled directly where schema information of all classes are available. the reason for doing this is that individual class schemas are never indeed pushed into the cache. if they were, this originial piece of code would work like a charm.
i do not change it myself as i am not much knowledgeable about the test processes and so on but somebody else may open a PR for that and i would appreciate that!

@tolgaatam
Copy link
Author

further investigated the issue and found from what the problem stemmed.

here is getOneSchema function from /src/Controllers/SchemaController.js line 458

  getOneSchema(className: string, allowVolatileClasses: boolean = false, options: LoadSchemaOptions = {clearCache: false}): Promise<Schema> {
    let promise = Promise.resolve();
    if (options.clearCache) {
      promise = this._cache.clear();
    }
    return promise.then(() => {
      if (allowVolatileClasses && volatileClasses.indexOf(className) > -1) {
        return Promise.resolve({
          className,
          fields: this.data[className],
          classLevelPermissions: this.perms[className],
          indexes: this.indexes[className]
        });
      }
      return this._cache.getOneSchema(className).then((cached) => {
        if (cached && !options.clearCache) {
          return Promise.resolve(cached);
        }
        return this._dbAdapter.getClass(className)
          .then(injectDefaultSchema)
          .then((result) => {
            return this._cache.setOneSchema(className, result).then(() => {
              return result;
            })
          });
      });
    });
  }

the last part does this:

  • check the cache for the schema of the class
    • if you found return it
    • if not, get it from database, set single class schema (wow what we need badly) and return what you got from db

in my previous comment there was the getOneSchema function from SchemaCache.js, which is called from SchemaController in this snippet. It first checks the cache for single class schema, if not found it checks the MAIN_SCHEMA and extract the schema of the class. Therefore, if MAIN_SCHEMA exists in the cache, single class checks always succeed. Therefore setOneSchema is rarely called.

In the long run, MAIN_SCHEMA can frequently be found in the cache => single schemas are not set => getOneSchema calls always miss the cache in the first trial => .. becomes a performance-dropping loop :/

@stale
Copy link

stale bot commented Sep 18, 2018

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:feature New feature or improvement of existing feature
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants