Product management (1st part) #773

whisperity · 2017-08-07T08:44:59Z

Partially implements #691.

❗ Backward compatibility has broken

This patch massively breaks database backwards compatibility, because the meaning of the DB arguments changed! Do NOT start this without saving out your ~/.codechecker/codechecker.sqlite file, or your PSQL database. Most likely data will not be lost (as the table names are orthogonal), but you won't be able to access the run results anymore, at least not from CodeChecker.

This patch implements the first half of the product management system.

Changes summed up

The server process now organises run databases into products, which are unique database (files or psql connections), each of which are a "run database". (Run database is the new identifier name for what was the CodeChecker database before this patch. The other database is called the config database.)
Products are accessible through host:port/ProductID (localhost:8001/Default), in the browser. This is the old CodeChecker page listing the runs in that product.
The command-line also takes, instead of --host and --port, an --url argument, which is formatted exactly like the new web URL to access a product, that is, localhost:8001/Default.
Initial start of a SQLite server automatically creates this Default product for ease of use.
Tests have been rewritten to use the new structure. Testing time takes around half with this change, as the tests no longer bring up a server and wait for a database on every test.
The new homepage, the product list, automatically redirects to the only product's run list, if there is only one product configured, yet again, to ease the use for those who don't want to delve into this extensive feature.

Some stuff, such as below, are yet to be implemented:

Sanitising product URL parts (endpoints) from problematic special chars.
Handling if the database dies out mid-operation more gracefully. (Currently the server will continuously respond This product does not exist. It does not bring down the server, but it certainly can be more graceful, it will be later on.)
The concepts of root, superuser, product admin does not apply. If you have access to the server, you have access to every product whatsoever.

The "webadmin" pages are missing, too. There is currently only way to manage products is via the command-line.

Xazax-hun · 2017-08-07T09:23:42Z

alembic.ini

+# WARN: This needs to be changed before "alembic" calls depending on
+# which database we are revisioning, the RUN or the PRODUCT CONFIG one.
+
+# Value for RUN databases:


I'd rather have two ini files if that is possible than having to change this.

Not sure if alembic supports such.

We can either have separate directories with separate alembic.ini, or we can share the ini file:

f you really wanted two completely independent sets of migration scripts, then you'd run two migration environments. They can share the same alembic.ini like this: [my_db_one] sqlalchemy.url = [my_db_two] sqlalchemy.url = you then run alembic with "alembic -n my_db_one" or "alembic -n my_db_two". The "default" config area is set by -n. A single env.py script can get multiple database URLs in any way it wants, as it determines how config is accessed. If you look in the multidb/env.py script, you'll see it's pulling multiple database urls from one section using config.get_section(name) - config file: [alembic] # path to migration scripts script_location = ${script_location} # template used to generate migration files # file_template = %%(rev)s_%%(slug)s databases = engine1, engine2 [engine1] sqlalchemy.url = driver://user:pass@localhost/dbname [engine2] sqlalchemy.url = driver://user:pass@localhost/dbname2 usage: config = context.config db_names = config.get_main_option('databases') for name in re.split(r',\s*', db_names): engines[name] = rec = {} rec['engine'] = engine_from_config( config.get_section(name), prefix='sqlalchemy.', poolclass=pool.NullPool) Over here I have both forms of multi db at the same time. There's two migration environments, and one migration environment does two databases that are largely mirrored, so three databases total. All three make use of a common env.py script that's in my application as a library, they then implement an env.py in the migration environment that draws upon the myapp/lib/env.py script for common features. You can pass instructions to a single env.py that may be controlling multiple databases using --tag: "alembic --tag my_tag" "my_tag" is available in env.py as context.get_tag_argument(). You can use that to conditionally run migrations on one database or the other. This is all DIY. Multi-database migrations can happen in many different ways so you'd need to build the approach that suits your situation best.

@Xazax-hun Thanks for clearing this up. I've updated the script.

Xazax-hun · 2017-08-07T09:24:07Z

api/products.thrift

+
+
+/*
+struct PrivilegeRecord {


I do not like the idea of checking in commented code.

This is in line with the not yet fully committed database tables for product access, in the config_db_model. I'd rather not remove this, as these structs and object models set forth the further development line.

Xazax-hun · 2017-08-07T09:26:03Z

docs/user_guide.md

-* `8001` - CodeChecker result viewer
+* `8001` - CodeChecker server
+
+The server listens only on the local machine.


Maybe we should mention that there is a command line argument to not to constrain the listening to the local machine?

That is why it says Default configuration. If someone checks the help for server, they will see.

Xazax-hun · 2017-08-07T09:33:57Z

libcodechecker/cmd/product_client.py

+    LOG.debug("Sending request to add product...")
+    success = client.addProduct(prod)
+    if success:
+        LOG.info("Added successfully.")


I'd rather have a more descriptive info message, like Product added successfully.

Xazax-hun · 2017-08-07T09:38:00Z

libcodechecker/libclient/client.py

+    Check if server product API is supported by the client.
+    """
+
+    version = client.getAPIVersion()


Is it intended that the same version queried here and above? Also, this functionality is inside the code for at least 3 times, (e.g. for database versioning). Maybe we should not duplicate this that many times but only have one version checking function since we are using the same versioning schema for everything?

It is not the same version. There is a database version, a product API version, and a run API version. The fact that they all are 6.0 right now is merely coincidental, as we are heavily breaking API and not versioning it at this point, but there will be one release which will set forth versioning from 6.0 and onwards.

Same as how the database migration is not handled, as the whole infrastructure in a big WIP.

Do we want separate versions for those APIs?

I would, yes. It could very well be that the run API changes earlier than the product API. They are very essentially two different systems, which could have two different clients, etc.

So is it likely to have a client that is only interested in one of the APIs and not the other?

Yes. If a client can make sure through every other means that the product exists (and later on, the user has rights to access), then it need not access the product-api at all.

But... then again, is it likely that we will have clients not our own? 😁

I'll eliminate these and put the version into the shared file. But the code that checks the version here must stay, because this function and the other can be called entirely separately. (One client is the one that returns the same version from endpoint Products, the other from CodeCheckerService.) Two distinct control flow, even though the value right now is the same, and even if the value will be made actually the same constant all the time.

Xazax-hun · 2017-08-07T09:39:43Z

libcodechecker/libclient/product_helper.py

+
+import os
+import sys
+# import datetime


Commented code.

Keeps this file in accordance with the other two helpers in the module.

Xazax-hun · 2017-08-07T09:40:04Z

libcodechecker/libclient/product_helper.py

+                       "=" + session_token}
+            self.transport.setCustomHeaders(headers)
+
+            # ------------------------------------------------------------


I do not like these kind of comments.

Keeps this file in accordance with the other two helpers in the module.

I see, but why do we have a separate ThriftClientCallwrapper? Shouldn't we reduce the duplication?

The reason is unknown. I think back when doing the Authentication module we decided to keep the pattern. Also, different endpoint clients could have different ways of handling the received exceptions. I think there are subtle differences in the except statements already across the modules.

I would rather not introduce extra refactorings into this package. The scheme was already given and existing in the code, and I like keeping that.

Xazax-hun · 2017-08-07T09:42:10Z

libcodechecker/libhandlers/cmd.py

+                                  help="The URL of the product to store the "
+                                       "results for, in the format of "
+                                       "host:port/ProductName.")
+    elif not needs_product_url:


Shouldn't this be a simple else instead?

The comment says above, that needs_product_url handles None, True and False separately.

None indicates that the command does not connect to any sort of server, and no arguments related to connection are, thus, added.

If it is not None, then:

True means command connects to a product (most of the cmd subcommands), and this code includes --url to the arg list.

False means command connects to a server (such as products, and login), and the code includes --host and --port.

Yeah, but I was wondering, if None and True already handled, there is only one case left.

Sure, I could change. I would have opted for a more explicit code, due to lack of a trit type.

Xazax-hun · 2017-08-07T09:54:17Z

libcodechecker/libhandlers/cmd.py

+            "specified! The database specified here must exist, and be "
+            "connectible by the server.")
+
+        # TODO: --dbSOMETHING arguments are kept to not break interface from


Why? I mean, we do not need to preserve backward compatibility.

These TODOs will be cleared later on, as part of #745. Right now, this is here, to keep the argument list akin to that of server.

Xazax-hun · 2017-08-07T10:05:05Z

libcodechecker/server/client_db_access_server.py

+
+        # The split array looks like ['', 'product-name', ...].
+        first_part = urlparse.urlparse(self.path).path.split('/', 2)[1]
+        return first_part if first_part not in NON_PRODUCT_NAMES else None


Wouldn't be better to have an URL like http://localhost:8001/product/[product-name]/#{request-parts} so we do not need to maintain a list like NON_PRODUCT_NAMES?

Would make the command line and the thrift client creation lot more complicated. (*) This list doesn't really need to be maintained, as there is almost no chance that a new subfolder under www-root is added, as we already have every "resource folder" (fonts, styles, etc.) present and used.

(*): Right now, due to how URLs work, setting up the API endpoint in Thrift in the web code like new Thrift("Authentication") automatically puts it under /[product-name]/Authentication. Do we want to make it /product/[product-name]/service/Authentication or... something else? It would make the webserver code a lot more complicated, as we need to start actually parsing the active URL there.

Xazax-hun · 2017-08-07T14:01:02Z

api/products.thrift

+}
+
+/* ProductConfiguration carries administrative data regarding product settings. */
+struct ProductConfiguration {


This is almost the same as Product. Do we need both of these?

We will, yes. Product will be extended with extra data, and Product carries information for users.

ProductConfiguration on the other hand carries configuration data, such as database stuff, which is applicable for the - upcoming - Web admin and command line management tools.

Xazax-hun · 2017-08-07T14:02:22Z

api/products.thrift

+  3:          i32    port,
+  4:          string username_b64,
+  5: optional string password_b64,   // Database password is NOT sent server->client!
+  6:          string database


Maybe a more descriptive name like database_name?

Would not be more descriptive. If engine is sqlite, this is a database path. If postgresql, this is a database. (And almost all connection strings in the world name this field as database.)

A comment would be nice what could be the values of it.

Any new comments on this?

Xazax-hun · 2017-08-07T14:02:59Z

api/products.thrift

+  1:          string engine,         // The database engine, such as "sqlite" or "postgresql".
+  2:          string host,
+  3:          i32    port,
+  4:          string username_b64,


What is the b64 suffix? Do we really want to encode this in base64 also in the future?

In the future, no. But right now, Thrift is pretty much messed up and there isn't a good way to get over this. This was a decision to be made. Later on, when the upstream packages are updated, the API will be changed.

What if the just put a comment that this value should be b46 encoded?

Isn't changing how data is transmitted considered a backward-incompatible API break? In this case, the rename of the variable would automatically make the break for us.

Xazax-hun · 2017-08-07T14:08:30Z

docs/postgresql_setup.md

 ~~~~~~

-Start web server to view the results.
+Check a test project.


The check sub command no longer starts the server even when we are using SQLite?

Nope. Since the "one storage and viewer server" update in 5.9, server needed to be started before a check. This update only changes so that check takes URL instead of host-port.

Xazax-hun · 2017-08-07T14:13:08Z

docs/user_guide.md

@@ -831,13 +836,13 @@ CodeChecker server -w ~/codechecker_wp --dbname myProjectdb --dbport 5432 --dbad
 The checking process can be started separately on the same machine

 ~~~~~~~~~~~~~~~~~~~~~
-CodeChecker check  -w ~/codechecker_wp -n myProject -b "make -j 4" --host localhost --port 8001
+CodeChecker check  -w ~/codechecker_wp -n myProject -b "make -j 4" --url localhost:8001/Default


Do we need to spell out the Default? Can't we omit it?

In theory, you can emit the entire --url parameter, as it defaults to localhost:8001/Default automatically. However, I think it is better, if the very advanced usage guides explicitly state that this argument is there.

Just as how, before the change, --host and --port was there, even though it was not needed, as this example here set them to their default values already.

Xazax-hun · 2017-08-07T14:22:24Z

libcodechecker/libclient/client.py

@@ -156,7 +148,93 @@ def setup_client(host, port, uri):
                      "login'.")
            sys.exit(1)

-    client = thrift_helper.ThriftClientHelper(host, port, uri, session_token)
+
+def setup_product_client(host, port, product_name=None, already_authed=False,


Is it a good idea to have an already_authed argument? Can't you query that using the session token somehow?

Xazax-hun · 2017-08-07T14:24:43Z

libcodechecker/libclient/client.py

+    product = product_client.getProducts(product_name, None)
+    product_error_str = None
+    if not (product and len(product) == 1):
+        product_error_str = "It does not exist."


Maybe Product does not exist.?

If you look at the context of this function, you'll see that the full error printed to the user is formatted as:

The given product {URL_ENDPOINT} cannot be used! {ERROR_REASON}

It would be superfluous to say "product" once again.

Xazax-hun · 2017-08-07T14:26:33Z

libcodechecker/libclient/product_helper.py

+                       "=" + session_token}
+            self.transport.setCustomHeaders(headers)
+
+            # ------------------------------------------------------------


I see, but why do we have a separate ThriftClientCallwrapper? Shouldn't we reduce the duplication?

Xazax-hun · 2017-08-07T14:45:18Z

libcodechecker/server/client_db_access_server.py

+            self.__connected = False
+            self.__last_connect_attempt = (datetime.datetime.now(), ex.message)
+
+    def teardown(self):


Does it make sense to make this a context manager so it can work with with statements.

It doesn't. These database connection engines, just like the config one, are shared and alive on the entire server. These methods only exist to support loading and unloading databases mid-operations.

Contexts are created with the scoped_session objects instantiated from the session_factory property, see do_POST for reference.

Xazax-hun · 2017-08-07T14:47:54Z

libcodechecker/server/client_db_access_server.py

-        Session.configure(bind=self.__engine)
-        self.sc_session = Session
+        self.context = context
+        self.check_env = check_env


Why do we need these new fields?

They store the external environment "pointers" from libhandlers/server.py, where this class is theoretically initialised. Otherwise, at every product database connection, the context configuration files and the checker environment would need to be loaded, which results in an increased disk usage, and an increased bloat in the log output.

csordasmarton · 2017-08-07T19:35:02Z

www/scripts/productlist/ListOfProducts.js

+function (declare, domConstruct, ItemFileWriteStore, topic, Button,
+  TextBox, BorderContainer, ContentPane, DataGrid, util) {
+
+  function prettifyDuration(seconds) {


This file contain the same function: https://github.com/whisperity/CodeChecker/blob/product_management/www/scripts/codecheckerviewer/ListOfRuns.js#L22.
Move it to a common place such as util.js.

Thank you for noticing this. It was an unused method in this file anyways.
(Needless to say I used ListOfRuns as a source when I started doing this page 😀 )

csordasmarton · 2017-08-07T19:46:32Z

api/products.thrift

+}
+typedef list<Product> Products
+
+service productService {


The other service names start with the prefix codeChecker:

codeCheckerAuthentication

codeCheckerDBAccess

I think we should use the same convention here (E.g.: codeCheckerProduct)

Xazax-hun

Some of my comments are not related to the diff but to how the code is organized right now. If you do not want to deal with those problems now, I'm fine with adding a TODO comment.

Xazax-hun · 2017-08-10T12:15:58Z

tests/libtest/thrift_client_to_db.py

 def get_auth_client(port, host='localhost', uri='/Authentication',
                    auto_handle_connection=True, session_token=None):
    return CCAuthHelper(host, port, uri,
                        auto_handle_connection,
                        session_token)
+
+
+def get_product_client(port, host='localhost', uri='/Products',


Why do we have these simple forwarding functions rather than instantiating the clients?
(And why do they have helper in their names? )

Ask the original author of the test suite, the one who wrote the functions above. I kept myself to this scheme.

They have the name Helper in their types just as how the Thrift client helpers (call stub forwarders) in libcodechecker/libclient do. These objects are roughly the same as those, but they don't expose their method stubs to the environment either, and act as a true low-level forwarders... whereas those in libclient have at least a method stub defined.

These even more low-level stubs exist here so theoretically, the test uses uses the least possible code from libcodechecker.

Xazax-hun · 2017-08-10T12:17:35Z

tests/libtest/thrift_client_to_db.py

-        # import only if necessary; some tests may not add this to PYTHONPATH
+    def __init__(self, host, port, product, endpoint,
+                 auto_handle_connection=True, session_token=None):
+        # Import only if necessary; some tests may not add this to PYTHONPATH.


Is this a good pattern? Wouldn't we end up importing these twice? Does that have a negative side effect?

Importing a module multiple times is not a problem docs. It only gives a performance overhead on name resolution.

A module can contain executable statements as well as function definitions. [...] They are executed only the first time the module name is encountered in an import statement.

But even if importing the same module multiple times would execute, the things imported here are functions and constant, which are stateless.

Xazax-hun · 2017-08-10T12:21:40Z

tests/libtest/env.py

 from thrift_client_to_db import get_auth_client

 from functional import PKG_ROOT
 from functional import REPO_ROOT


+def get_free_port():


Is this a good way to get a free port? Once we closed the socket, the OS might give this port to someone else, so there might be a race condition (although it should be pretty rare to run into this problem, and this is in test code anyways)

It is absolutely not a good way, due to that problem.
It is also absolutely a method that has been moved from another location to this place.

But since we only ever set up a single server (excluding some certain more instristic test cases), this method is used way less times than before this patch. AFAIK, we call this 4 times, as opposed to the around 20-25 (postgresql AND server port was retrieved by this!) previously.

Great! I would add a TODO though.

TODO added.

Xazax-hun · 2017-08-10T12:26:52Z

tests/functional/diff/test_regression_diff.py

@@ -71,6 +71,10 @@ def setUp(self):
        self._test_config = env.import_test_cfg(test_workspace)
        self._run_names = env.get_run_names(test_workspace)

+        self._url = self._test_config['codechecker_cfg']['viewer_host'] + \


I saw similar code multiple times. Maybe a config->url converter utility would be helpful? Or you might even have one somewhere?

It appeared in 3 places. I've factored it out.

Xazax-hun · 2017-08-10T12:35:31Z

libcodechecker/server/product_db_access_handler.py

+        """
+
+        try:
+            product = self.__session.query(Product) \


Maybe you could use get instead of filter and early return when the result is None? (So we do not fail when the clients try to remove a non-existing product.)

Isn't attempting to remove a non-existent thing a failure? I agree with an early-"return", but in form of an early raise.

That is fine for me as well.

Xazax-hun · 2017-08-10T12:40:03Z

libcodechecker/server/product_db_access_handler.py

+                                              msg)
+
+    @timeit
+    def getCurrentProduct(self):


What is the current product? Where is the state stored? Is this per connection? When and why would a client query this?

The line immediately after the comment of this method points you to where the "current product" is retrieved from. It is in self.__product. This is populated by the router, if one connects to this endpoint as /[product-name]/ProductService instead of /ProductService.

Currently, only one place uses it. The run viewer client (the old home page) queries the name of the current product via this endpoint (setting up the API call in the aforementioned manner), and puts it into the browser's title page, and presents it on the web page body too.

I see, but I was wondering whether self.__product is per connection or what is its lifetime. So should we afraid of race conditions? I would rather eliminate a state like this in that case.

The variable pointed by self.__product has a global lifetime of the entire server, as this is a Product connection object, containing the database engine and everything, defined in client_db_access_server. For each product currently connected, such an object exists.

self itself is the Thrift handler, which is constructed per-API-call in do_POST of the server handler, which gives one of these Product instances as argument.

Xazax-hun · 2017-08-10T12:41:08Z

libcodechecker/server/product_db_access_handler.py

+                if not server_product.connected:
+                    server_product.connect()
+
+                name = base64.b64encode(prod.display_name.encode('utf-8')) \


I see this pattern duplicated. Wouldn't it be better to always store a name instead of always checking for this being None?

Xazax-hun · 2017-08-10T12:48:49Z

libcodechecker/server/config_db_model.py

+    __tablename__ = 'products'
+
+    __table_args__ = (
+        UniqueConstraint('endpoint'),


If we expect endpoint to be unique, shouldn't that be the primary key instead of having a separate autoincrement id?

I would rather have an ID as primary key in the product_admin table, and various other permission tables that are upcoming. Strings as primary keys are a problematic subject... for example, MSSQL does not allow foreign keying onto a string (personal experience from half a year ago).

Also, while endpoint is expected to be unique, it isn't expected to always stay the same.

Xazax-hun · 2017-08-10T12:49:41Z

libcodechecker/server/config_db_model.py

+from sqlalchemy.ext.declarative import declarative_base
+from sqlalchemy.orm import *
+
+CC_META = MetaData(naming_convention={


Shouldn't we move this to a common place? (Maybe even Base)

You can not move Base to a common place, at it is the base class of this ORM hierarchy (and run_db_model has its own). And the two hierarchies are separate, they cannot be joined by the same base class. It would result in the database creation always creating both schemas in every database, which eventually prohibits every chance of migration.

Xazax-hun · 2017-08-10T12:51:33Z

libcodechecker/server/config_database.py

@@ -4,41 +4,37 @@
 #   License. See LICENSE.TXT for details.
 # -------------------------------------------------------------------------
 """
-Database server handling.
+Database server handling for the server's configuration database.


I am still a bit uncomfortable with duplicating this whole file. Could you explain why is it better than instantiating the database objects multiple times?

Because it uses entirely different semantics in subtle places. First of all, the used schema is imported from a different place. Second of all, failure to instantiate the config connection is a critical condition that brings down the entire server -- failure to instantiate a product connection is only an error. Config connection is initiated from the command-line, whereas product connections are created from connection strings.

From above, the two files look very similar. But there are many subtle differences. Just as how we have duplicated the common parts in the schema (such as the Metatable), and duplicate the migration environment's common code, I think it's much better if the two database connection systems are separated on the package name.

But we could pass the schema as an argument, it could always raise an exception and in case the exception is from the product database, the caller could bring down the server. And we have utilities to convert between connection strings and command line arguments. In this case, my gut feeling is that the need to maintain the code twice is worse than abstracting these away.

However, I think, it would make debugging harder. I ran into a situation while creating this patch where it was very useful that in the stack trace it was obvious which handler was used in a certain case.

If we were only passing by arguments, this information would be lost. I think there is a divide between us about this, because I am very much for keeping these things separate. (Originally, they were even in separate folders under libcodechecker.server but I did not want to have import strings that contain 3-4 elements in their path.)

My main concern, this way we always need to remember to change the code in multiple places. The information you mentioned is not lost, you can always check how the database engine was created (or add debug prints).

Okay, I'll do this, most likely tomorrow.

gyorb · 2017-08-24T14:11:28Z

api/products.thrift

+  3:          i32    port,
+  4:          string username_b64,
+  5: optional string password_b64,   // Database password is NOT sent server->client!
+  6:          string database


A comment would be nice what could be the values of it.

gyorb · 2017-08-24T14:20:56Z

api/products.thrift

+typedef list<ProductConfiguration> ProductConfigurations
+
+/* Product carries data to the end user's product list and tasks. */
+struct Product {


Shouldn't we use a more descriptive name here? There is access control data for a specific user and global product data mixed together.
Should we separate this information?

This struct is used (akin to RunData) to query the information for listing products (and checking their accessibility). Separating this would result in the client (both the GUI and the CLI) having to send multiple API requests to figure out if a product exists.

(In this case, from the client's perspective, exists means: the product endpoint is registered in the configuration (with some extra fluff such as displayed name and ID) + the server can connect to the database + the user has rights to access the product. If any of this doesn't hold, the clients will report an appropriate error, as no runs can be accessed in the product.)

Perhaps the next patch will remove ąccessible. It is not fleshed out yet, only a placeholder exists so the GUI can properly present a 🔒 symbol. Currently, the API handler for getProducts() hardcodes accessible=True.

I prefer api-s which check only one thing and not multiple things. Is it really that much of an overhead if multiple api calls are made (2-3 extra?)

For each product, this would mean the following:

Construction of a scope-specific permission configuration with the individual product's ID

Encoding this data and making an API call

The server opening a session on the configuration database and querying whether the user has rights to view the product

This might incur more than one query due to permission inheritance

Creating the response and sending it back

The UI handling this

For each product. Which we don't know how many will be.

This will be almost as bad if for each Run we would individually query the number of reports for each detection status. AKA:

getDetectionStatusForRun(run_id, NEW)

gDSFR(id, UNRESOLVED)

...

For each run.

Whereas this data (especially whether or not the database connection to the product could be made) is readily available at product listing, and is an integral part of getting user-facing information about the product.

@gyorb You yourself had to make #844 to speed up the queries. Your comment here states the exact opposite of what you seemed to have to do in #844 to make the system faster.

Yes I understand this, but I still think clean api is better which does/checks only one thing.
I did not sad we have to change it I've just sad I prefer not to mix multiple types of checks
"product endpoint exists" and "the user has access rights to it" into one api call.
On my query speed up I did not change the behavior of the api call or merged together with another just to perform better. I've improved an already existing api call which does only one thing.

gyorb · 2017-08-24T14:23:07Z

api/shared.thrift

@@ -4,6 +4,8 @@
 //   License. See LICENSE.TXT for details.
 // -------------------------------------------------------------------------

+const string API_VERSION = '6.0'


Is there any other data what we share between the new product api, the report server api and the auth api?

I don't think so. I think this change here was a result of a discussion with the team in person to not have separate API versions for every endpoint.

Ok, I've just asked it because there are multiple structs in this file which should be moved to the report server thrift file, if they are used only there.

gyorb · 2017-08-24T14:37:10Z

docs/products.md

+
+optional arguments:
+  -h, --help            show this help message and exit
+  -n DISPLAY_NAME, --name DISPLAY_NAME


I think it could be misleading if the display name is changed to a different value. How can I get the product name for a given display name?

The display name is what appears in the title bar when you are viewing the product's runs. It can contain spaces and special characters.

CodeChecker cmd products list shows the product's endpoint (and so does your browser's address bar when viewing the runs). The endpoint will have heavy restrictions imposed on its format, such as only alphanumeric characters, not many special chars, certain keywords will be prohibited, etc.

Access in the command-line is done through the endpoint (although cmd products list shows the displayed name). The display name is purely cosmetic, but the Web GUI filter filters based on display name. Think of it as run ID vs run name. Run name is purely cosmetic, as API access and database integrity is done through the ID.

I still think that this can be a bit confusing that if I wan to list the product I get the displayed name but if I want to interact with a product I have to use the product name in the command line.
What do you think @dkrupp, @Xazax-hun ?

Product endpoint, not name. Those are two different things, as explained earlier.

gyorb · 2017-08-24T14:41:20Z

docs/products.md

+  Values of these arguments are ignored, unless '--postgresql' is specified!
+  The database specified here must exist, and be connectible by the server.
+
+  --dbaddress DBADDRESS, --db-host DBADDRESS


Should this go directly to the database? The centrally started server could not do this, through the api?

So in this case why do we need the configuration database connection arguments here?

I don't fully understand the question. This utility is cmd products add, which essentially calls the addProduct API function.

The run database could be anywhere, just like how the configuration database could be hosted on an arbitrary machine. The server configuration is modular in the sense that you can connect any run database (to which you have valid credentials for, of course).

Okay, I think I understand my own confusion, and hopefully yours, too. These are NOT the configuration database options here. These are the database options for the run database.

CodeChecker cmd products add \ # This says which server to connect to. # The server is already running and has a configuration database mounted through # its cmdline arguments. --host codechecker.mydomain.com --port 9999 \ MyProduct --name "A fancy product" \ --postgresql --dbaddress fancyproduct.mycompany.net \ # These are the arguments where the RUN database is located. --dbport 5432 --dbuser whisperity

Ok, thanks for clearing it up!

gyorb · 2017-08-24T15:31:06Z

libcodechecker/libclient/client.py

@@ -64,9 +62,6 @@ def handle_auth(host, port, username, login=False):
        if auth_token and handshake.sessionStillActive:
            LOG.info("You are already logged in.")
            return
-        else:
-            LOG.info("Server requires authentication to access. Please use "


Do we provide the information to the user somewhere else?

Yes. When the user tries to log in, they get this on the cmd:

$ CodeChecker cmd runs [ERROR] [18:59] - Access denied. This server requires authentication. [ERROR] [18:59] - Please log in onto the server using 'CodeChecker cmd login'.

This message was removed so that typing in CodeChecker cmd login does not prompt like this:

$ CodeChecker cmd login [19:00] - Server requires authentication to access. Please use 'CodeChecker cmd login' to authenticate. [19:00] - Logging in using credentials from command line... Please provide password for user 'whisperity'

(The first line is removed. It is superfluous to tell the user to use [command] to authenticate right when they are actually using [command].)

gyorb · 2017-08-30T09:20:09Z

alembic.ini

+# are written from script.py.mako
+# output_encoding = utf-8
+
+sqlalchemy.url = postgres://codechecker@localhost:5432/codechecker


Shouldn't we use codechecker_cfg database name so it will not be the same as the current database name used to store the reports?

Are these values used anywhere? When the database is started in PostgreSQL mode, the user has to explicitly specify the config database's name.

gyorb · 2017-08-30T09:21:38Z

api/products.thrift

+  3:          i32    port,
+  4:          string username_b64,
+  5: optional string password_b64,   // Database password is NOT sent server->client!
+  6:          string database


Any new comments on this?

gyorb · 2017-08-30T09:22:23Z

api/products.thrift

+  1:          string engine,         // The database engine, such as "sqlite" or "postgresql".
+  2:          string host,
+  3:          i32    port,
+  4:          string username_b64,


What if the just put a comment that this value should be b46 encoded?

gyorb · 2017-08-30T09:27:09Z

api/products.thrift

+typedef list<ProductConfiguration> ProductConfigurations
+
+/* Product carries data to the end user's product list and tasks. */
+struct Product {


I prefer api-s which check only one thing and not multiple things. Is it really that much of an overhead if multiple api calls are made (2-3 extra?)

gyorb · 2017-08-30T09:41:35Z

docs/products.md

+
+optional arguments:
+  -h, --help            show this help message and exit
+  -n DISPLAY_NAME, --name DISPLAY_NAME


I still think that this can be a bit confusing that if I wan to list the product I get the displayed name but if I want to interact with a product I have to use the product name in the command line.
What do you think @dkrupp, @Xazax-hun ?

gyorb · 2017-08-30T10:38:25Z

libcodechecker/cmd/product_client.py

+    if success:
+        LOG.info("Product added successfully.")
+    else:
+        LOG.error("Adding the product has failed.")


sys.exit(1) if adding fails?

gyorb · 2017-08-30T10:39:15Z

libcodechecker/cmd/product_client.py

+    if success:
+        LOG.info("Product removed.")
+    else:
+        LOG.error("An error occurred in product removal.")


sys.exit(1) here too if we fail to remove the product?

gyorb · 2017-08-30T10:48:07Z

libcodechecker/libclient/client.py

+    try:
+        host, port, product_name = split_product_url(product_url)
+    except:
+        LOG.error("Couldn't understand the product URL!")


I think "Wrong product URL format was provided" would be a better error message.

gyorb · 2017-08-30T10:56:02Z

libcodechecker/libhandlers/cmd.py

+                           help="Name of the database to use.")
+
+        def __handle(args):
+            """Custom handler for 'add' so custom error messages can be


Is is worth to create a separate handle just because we do not want to capture the exceptions in main?

This same code resides in server.py too. This is because main() everywhere deals with a proper argparse.Namespace. This __handle is basically an extension over argparse's argument parsing routine, as there are some special rules that can NOT be expressed with argparse's existing tooling.

gyorb · 2017-08-30T10:57:23Z

libcodechecker/libhandlers/plist.py

@@ -8,6 +8,8 @@
 functionality of 'store' and 'parse'.
 """

+# TODO: This command should be removed as a whole.


An issue would be better than a TODO comment here.

#745's first action point is this removal.

gyorb · 2017-08-30T14:06:25Z

libcodechecker/server/client_db_access_handler.py

@@ -26,6 +26,7 @@
 from sqlalchemy import func

 import shared
+from shared import constants as shared_constants


shared.constants could be used if only shared is imported its length is the same as shared_constants

- Introduce new configuration database schema, and database connection handling to this new database - Added basic API for product management - PostgreSQL server is not started automatically anymore - Products have a distinct endpoint (localhost:8001/ProductName/CodeCheckerService) in the request, and the requests are routed to the individual products based on this. - If server is started with SQLite configuration database and this is a brand new start, automatically configure a 'Default' product in the same folder. - Store a connection failure state for products that could not be connected to at server start. As no other packages manage and use database connections anymore, put the ORM and the run database handling into libcodechecker.server. This commit breaks the tests. It will be fixed later on. [ci skip]

…multiple database engines at the same time [ci skip]

- The CodeChecker Viewer now by default opens a product list. Users can select the product they want to view and will be routed to the run list. - If only ONE products exist on the server, simply opening http://host:port will automatically redirect the user to the run list of this single product. - This will only happen if this only product is properly connected. - Added a button to run lists to explicitly return to the product list page - The current product's name is now shown in the run list's title bar [ci skip]

- 'store' and 'cmd' now take --url PRODUCT_URL where applicable, instead of --host and --port. This URL defaults to the default product created by a brand new 'server' start, so can be omitted by end-users if they only work a single product. - Added 'cmd products' to handle basic product-related configuration, such as listing, adding and removing products - Added the appropriate server-side code to on-the-fly add and disconnect product databases Tests will be updated in a later commit. [ci skip]

- Tests now start a single server at the beginning of testing - PostgreSQL must be started prior the tests being ran - Each test suite registers a new product with a unique endpoint, and uses that particular run database to execute the tests that were already implemented. - These databases are dropped when the test is over

whisperity added this to the 6.0 pre2 milestone Aug 7, 2017

dkrupp mentioned this pull request Aug 7, 2017

Multiple products on the same server (multi-database system) [142h] #691

Closed

29 tasks

Xazax-hun suggested changes Aug 7, 2017

View reviewed changes

whisperity requested a review from gyorb August 7, 2017 12:15

whisperity force-pushed the product_management branch 4 times, most recently from b5dde20 to 79284a3 Compare August 7, 2017 14:48

Xazax-hun reviewed Aug 7, 2017

View reviewed changes

csordasmarton reviewed Aug 7, 2017

View reviewed changes

whisperity force-pushed the product_management branch 2 times, most recently from fe2e5b2 to 7a459ba Compare August 8, 2017 08:38

Xazax-hun reviewed Aug 10, 2017

View reviewed changes

whisperity force-pushed the product_management branch 2 times, most recently from 5e16bea to 2115e35 Compare August 10, 2017 14:48

whisperity added the WIP 💣 Work In Progress label Aug 10, 2017

whisperity force-pushed the product_management branch 6 times, most recently from 87d323e to e167bb2 Compare August 11, 2017 09:43

whisperity mentioned this pull request Aug 11, 2017

Case insensitive run name search #794

Merged

whisperity removed the WIP 💣 Work In Progress label Aug 11, 2017

whisperity force-pushed the product_management branch from e167bb2 to 0c71680 Compare August 14, 2017 10:09

whisperity force-pushed the product_management branch from b45da9b to 5171df5 Compare August 19, 2017 00:41

whisperity force-pushed the version6 branch from 1c1b6ad to a749f84 Compare August 21, 2017 07:18

whisperity force-pushed the product_management branch 5 times, most recently from 4ac35ca to d2de71b Compare August 23, 2017 15:37

gyorb suggested changes Aug 24, 2017

View reviewed changes

whisperity force-pushed the product_management branch 2 times, most recently from a2d7550 to cf8ec7d Compare August 25, 2017 15:48

gyorb suggested changes Aug 30, 2017

View reviewed changes

whisperity force-pushed the product_management branch 2 times, most recently from 023e259 to 2559306 Compare August 30, 2017 11:55

gyorb suggested changes Aug 30, 2017

View reviewed changes

whisperity force-pushed the product_management branch from 2559306 to 22ba5ea Compare August 30, 2017 15:05

gyorb approved these changes Aug 31, 2017

View reviewed changes

whisperity force-pushed the product_management branch from 22ba5ea to b4db8fe Compare August 31, 2017 12:15

whisperity mentioned this pull request Aug 31, 2017

Product management (2nd part), permission system #857

Merged

gyorb mentioned this pull request Sep 1, 2017

Fix relative document links in the HOWTO #852

Merged

whisperity force-pushed the product_management branch from b4db8fe to d087ed9 Compare September 1, 2017 13:12

whisperity added 6 commits September 1, 2017 15:40

Refactor database handlers to use a shared code, and support running …

ac3cc0d

…multiple database engines at the same time [ci skip]

Add documentation for the product system

f484490

whisperity force-pushed the product_management branch from d087ed9 to f484490 Compare September 1, 2017 13:47

gyorb merged commit f0cf861 into Ericsson:version6 Sep 1, 2017

csordasmarton mentioned this pull request Sep 4, 2017

Use the same naming convention at CodeChecker javascript services #869

Merged

whisperity deleted the product_management branch September 4, 2017 15:48

Product management (1st part) #773

Product management (1st part) #773

Conversation

whisperity commented Aug 7, 2017

❗ Backward compatibility has broken

Changes summed up

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

whisperity Aug 8, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Xazax-hun left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

whisperity Aug 8, 2017 •

edited

Loading