Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

psycopg2.errors.CardinalityViolation: ON CONFLICT DO UPDATE command cannot affect row a second time #18

Closed
iainelder opened this issue Mar 24, 2021 · 20 comments · Fixed by #28
Assignees

Comments

@iainelder
Copy link

I'm trying to import a second AWS account to my introspector database.

The first account is the organization management account. The second acount is a member of that organization.

I created the introspector IAM user in the member acount and set its credentials via the AWS_PROFILE environment variable.

I got the following error:

[2021-03-24 21:11:47,038] {aws.py:106} ERROR - exception caught in map
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1245, in _execute_context
self.dialect.do_execute(
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 588, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.CardinalityViolation: ON CONFLICT DO UPDATE command cannot affect row a second time
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.

Error in context:

$ ./introspector account aws import
[...truncate IAM credential report error...]
[2021-03-24 21:11:40,363] {resource.py:57} WARNING - Missing parent for relation arn:aws:ec2:us-west-2:012345678901:security-group/sg-00000000 in arn:aws:organizations::012345678901:account/o-abcdefghij/098765432109 (1)
[...truncate 100+ warnings about "Missing parent for relation"...]
[2021-03-24 21:11:44,653] {resource.py:57} WARNING - Missing parent for relation arn:aws:ec2:us-west-2:012345678901:vpc/vpc-00000000 in arn:aws:organizations::012345678901:account/o-abcdefghij/098765432109 (1)
[2021-03-24 21:11:47,038] {aws.py:106} ERROR - exception caught in map
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1245, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 588, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.CardinalityViolation: ON CONFLICT DO UPDATE command cannot affect row a second time
HINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/introspector/cli/account/aws.py", line 103, in import_aws_cmd
    refresh_views(db, reloaded_import_job.provider_account_id)
  File "/app/introspector/bootstrap_db.py", line 157, in refresh_views
    result = db.execute(query)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1277, in execute
    return self._connection_for_bind(bind, close_with_result=True).execute(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 982, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 293, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1095, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1249, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1245, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 588, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.CardinalityViolation) ON CONFLICT DO UPDATE command cannot affect row a second time
HINT:  Ensure that no rows proposed for insertion within the same command have duplicate constrained values.

[SQL: INSERT INTO aws_config_configurationrecorder (
  _id,
  uri,
  provider_account_id,
  rolearn,
  allsupported,
  includeglobalresourcetypes,
  resourcetypes,
  name,
  laststarttime,
  laststoptime,
  recording,
  laststatus,
  lasterrorcode,
  lasterrormessage,
  laststatuschangetime,
  _iam_role_id,_account_id
)
SELECT
  R.id AS _id,
  R.uri,
  R.provider_account_id,
  rolearn.attr_value #>> '{}' AS rolearn,
  (allsupported.attr_value #>> '{}')::boolean AS allsupported,
  (includeglobalresourcetypes.attr_value #>> '{}')::boolean AS includeglobalresourcetypes,
  resourcetypes.attr_value::jsonb AS resourcetypes,
  name.attr_value #>> '{}' AS name,
  (TO_TIMESTAMP(laststarttime.attr_value #>> '{}', 'YYYY-MM-DD"T"HH24:MI:SS')::timestamp at time zone '00:00') AS laststarttime,
  (TO_TIMESTAMP(laststoptime.attr_value #>> '{}', 'YYYY-MM-DD"T"HH24:MI:SS')::timestamp at time zone '00:00') AS laststoptime,
  (recording.attr_value #>> '{}')::boolean AS recording,
  laststatus.attr_value #>> '{}' AS laststatus,
  lasterrorcode.attr_value #>> '{}' AS lasterrorcode,
  lasterrormessage.attr_value #>> '{}' AS lasterrormessage,
  (TO_TIMESTAMP(laststatuschangetime.attr_value #>> '{}', 'YYYY-MM-DD"T"HH24:MI:SS')::timestamp at time zone '00:00') AS laststatuschangetime,
  
    _iam_role_id.target_id AS _iam_role_id,
    _account_id.target_id AS _account_id
FROM
  resource AS R
  INNER JOIN provider_account AS PA
    ON PA.id = R.provider_account_id
  LEFT JOIN resource_attribute AS rolearn
    ON rolearn.resource_id = R.id
    AND rolearn.type = 'provider'
    AND lower(rolearn.attr_name) = 'rolearn'
  LEFT JOIN resource_attribute AS allsupported
    ON allsupported.resource_id = R.id
    AND allsupported.type = 'provider'
    AND lower(allsupported.attr_name) = 'allsupported'
  LEFT JOIN resource_attribute AS includeglobalresourcetypes
    ON includeglobalresourcetypes.resource_id = R.id
    AND includeglobalresourcetypes.type = 'provider'
    AND lower(includeglobalresourcetypes.attr_name) = 'includeglobalresourcetypes'
  LEFT JOIN resource_attribute AS resourcetypes
    ON resourcetypes.resource_id = R.id
    AND resourcetypes.type = 'provider'
    AND lower(resourcetypes.attr_name) = 'resourcetypes'
  LEFT JOIN resource_attribute AS name
    ON name.resource_id = R.id
    AND name.type = 'provider'
    AND lower(name.attr_name) = 'name'
  LEFT JOIN resource_attribute AS laststarttime
    ON laststarttime.resource_id = R.id
    AND laststarttime.type = 'provider'
    AND lower(laststarttime.attr_name) = 'laststarttime'
  LEFT JOIN resource_attribute AS laststoptime
    ON laststoptime.resource_id = R.id
    AND laststoptime.type = 'provider'
    AND lower(laststoptime.attr_name) = 'laststoptime'
  LEFT JOIN resource_attribute AS recording
    ON recording.resource_id = R.id
    AND recording.type = 'provider'
    AND lower(recording.attr_name) = 'recording'
  LEFT JOIN resource_attribute AS laststatus
    ON laststatus.resource_id = R.id
    AND laststatus.type = 'provider'
    AND lower(laststatus.attr_name) = 'laststatus'
  LEFT JOIN resource_attribute AS lasterrorcode
    ON lasterrorcode.resource_id = R.id
    AND lasterrorcode.type = 'provider'
    AND lower(lasterrorcode.attr_name) = 'lasterrorcode'
  LEFT JOIN resource_attribute AS lasterrormessage
    ON lasterrormessage.resource_id = R.id
    AND lasterrormessage.type = 'provider'
    AND lower(lasterrormessage.attr_name) = 'lasterrormessage'
  LEFT JOIN resource_attribute AS laststatuschangetime
    ON laststatuschangetime.resource_id = R.id
    AND laststatuschangetime.type = 'provider'
    AND lower(laststatuschangetime.attr_name) = 'laststatuschangetime'
  LEFT JOIN (
    SELECT
      _aws_iam_role_relation.resource_id AS resource_id,
      _aws_iam_role.id AS target_id
    FROM
      resource_relation AS _aws_iam_role_relation
      INNER JOIN resource AS _aws_iam_role
        ON _aws_iam_role_relation.target_id = _aws_iam_role.id
        AND _aws_iam_role.provider_type = 'Role'
        AND _aws_iam_role.service = 'iam'
    WHERE
      _aws_iam_role_relation.relation = 'acts-as'
  ) AS _iam_role_id ON _iam_role_id.resource_id = R.id
  LEFT JOIN (
    SELECT
      _aws_organizations_account_relation.resource_id AS resource_id,
      _aws_organizations_account.id AS target_id
    FROM
    (
      SELECT
        _aws_organizations_account_relation.resource_id AS resource_id
      FROM
        resource_relation AS _aws_organizations_account_relation
        INNER JOIN resource AS _aws_organizations_account
          ON _aws_organizations_account_relation.target_id = _aws_organizations_account.id
          AND _aws_organizations_account.provider_type = 'Account'
          AND _aws_organizations_account.service = 'organizations'
      WHERE
        _aws_organizations_account_relation.relation = 'in'
      GROUP BY _aws_organizations_account_relation.resource_id
      HAVING COUNT(*) = 1
    ) AS unique_account_mapping
    INNER JOIN resource_relation AS _aws_organizations_account_relation
      ON _aws_organizations_account_relation.resource_id = unique_account_mapping.resource_id
    INNER JOIN resource AS _aws_organizations_account
      ON _aws_organizations_account_relation.target_id = _aws_organizations_account.id
      AND _aws_organizations_account.provider_type = 'Account'
      AND _aws_organizations_account.service = 'organizations'
    WHERE
        _aws_organizations_account_relation.relation = 'in'
  ) AS _account_id ON _account_id.resource_id = R.id
  WHERE
  R.provider_account_id = 1 AND 
  PA.provider = 'aws'
  AND R.provider_type = 'ConfigurationRecorder'
  AND R.service = 'config'
ON CONFLICT (_id) DO UPDATE
SET
    rolearn = EXCLUDED.rolearn,
    allsupported = EXCLUDED.allsupported,
    includeglobalresourcetypes = EXCLUDED.includeglobalresourcetypes,
    resourcetypes = EXCLUDED.resourcetypes,
    name = EXCLUDED.name,
    laststarttime = EXCLUDED.laststarttime,
    laststoptime = EXCLUDED.laststoptime,
    recording = EXCLUDED.recording,
    laststatus = EXCLUDED.laststatus,
    lasterrorcode = EXCLUDED.lasterrorcode,
    lasterrormessage = EXCLUDED.lasterrormessage,
    laststatuschangetime = EXCLUDED.laststatuschangetime,
    _iam_role_id = EXCLUDED._iam_role_id,
    _account_id = EXCLUDED._account_id]
(Background on this error at: http://sqlalche.me/e/f405)
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1245, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 588, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.InFailedSqlTransaction: current transaction is aborted, commands ignored until end of transaction block


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/introspector.py", line 5, in <module>
    run_cli()
  File "/app/introspector/cli/__init__.py", line 60, in run_cli
    cli()
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/app/introspector/cli/account/aws.py", line 112, in import_aws_cmd
    db.commit()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1036, in commit
    self.transaction.commit()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 503, in commit
    self._prepare_impl()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 482, in _prepare_impl
    self.session.flush()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 2479, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 2617, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 153, in reraise
    raise value
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 2577, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 586, in execute
    persistence.save_obj(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/persistence.py", line 230, in save_obj
    _emit_update_statements(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/persistence.py", line 995, in _emit_update_statements
    c = cached_connections[connection].execute(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 982, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 293, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1095, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1249, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1245, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 588, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.InternalError: (psycopg2.errors.InFailedSqlTransaction) current transaction is aborted, commands ignored until end of transaction block

[SQL: UPDATE import_job SET end_date=%(end_date)s, error_details=%(error_details)s WHERE import_job.id = %(import_job_id)s]
[parameters: {'end_date': datetime.datetime(2021, 3, 24, 21, 11, 47, 41410, tzinfo=datetime.timezone.utc), 'error_details': '["Traceback (most recent call last):\\n  File \\"/usr/local/lib/python3.9/site-packages/sqlalchemy/engine/base.py\\", line 1245, in _execute_context\ ... (9259 characters truncated) ... \n    _iam_role_id = EXCLUDED._iam_role_id,\\n    _account_id = EXCLUDED._account_id]\\n(Background on this error at: http://sqlalche.me/e/f405)\\n"]', 'import_job_id': 5}]
(Background on this error at: http://sqlalche.me/e/2j85)

I'm using Docker image e934e4551c83 and release v0.1.2.

@iainelder
Copy link
Author

iainelder commented Mar 24, 2021

I tried this after reading that support for sub-accounts was implemented in #4. I didn't find any documentation for it, though. Let me know if I'm doing something wrong here.

@gsoltis gsoltis self-assigned this Mar 24, 2021
@gsoltis
Copy link
Contributor

gsoltis commented Mar 24, 2021

Hmm, I'll have to look into this. There are still a few sharp edges on sub-accounts.

@iainelder
Copy link
Author

iainelder commented Mar 24, 2021

To reproduce:

  • docker-compose down --volumes
  • docker-compose up -d
  • ./introspector init
  • AWS_PROFILE=member ./introspector account aws import
  • AWS_PROFILE=mgmt ./introspector account aws import

If I swap the order of the accounts I get the same error on the second import.

@gsoltis
Copy link
Contributor

gsoltis commented Mar 25, 2021

I was not able to reproduce this with the latest release, although it could be the result of a difference with our respective data sets. Can you try with the latest, and if it's still happening, we can dig in?

@iainelder
Copy link
Author

Yes, I see you just made a new release:

https://github.com/goldfiglabs/introspector/releases/tag/v0.1.3

I think I'll have time to try it over the weekend.

@iainelder
Copy link
Author

I get the same error when using the latest version.

It looks like the problem is in the aws_config_configurationrecorder table.

[SQL: INSERT INTO aws_config_configurationrecorder (

I tried to rule that out by setting the service flag to import just the ec2 service.

./introspector account aws import --service ec2

But I got the same error result.

I didn't see any examples of how to use the service flag, so I might be doing that wrong.

How can we dig into this?

@gsoltis
Copy link
Contributor

gsoltis commented Mar 27, 2021

Some quick background first: introspector has a series of tables for doing an import. First is raw_import which is mostly the json from botocore calls. Second is resource, resource_attribute, and resource_relation, which are generic mappings from the raw json to resources, their attributes, and their relations, and finally there are the aws_ tables, which are specific resources with columns and foreign keys for attributes and relations.

After running an import, even if it's just a single service (the --service flag definitely needs documentation...), introspector will resync the aws_ tables from the resource tables. What is likely happening is that the query to do that has a bug and is producing multiple conflicting rows for aws_config_configrecorder. The query for config recorder is here.

We can verify this by running the SELECT portion of the query in that file (lines 19-134) against your database. We're looking for duplicate _id values or uri, provider_account_id values. My guess is that the INNER JOINs to get the account id may be the culprit.

Can you post the results of that query (or email them) ?

@iainelder
Copy link
Author

We're looking for duplicate _id values or uri, provider_account_id values.

I haven't had time to pick apart the query to discover why, but I found a lot of duplicates.

Read on for details.

The analysis is performed after starting with a fresh introspector database and reproducing the import error.

raw_import

The relation raw_import is empty.

resource

The relation resource has two rows for the config service. Presumably they reprresent the configuration recorder in each account.

Surprisingly the path contains the organization ID instead of the AWS account ID.

The uri and provider_account_id values are the same for the two rows.

SELECT * FROM resource WHERE service = 'config';
 id |                path                 |                   uri                    |  name   | provider_account_id |     provider_type     | service | category 
----+-------------------------------------+------------------------------------------+---------+---------------------+-----------------------+---------+----------
  6 | o-abcdefghij$012345678901$eu-west-1 | configurationRecorders/eu-west-1/default | default |                   1 | ConfigurationRecorder | config  | 
  8 | o-abcdefghij$012345678901$us-east-1 | configurationRecorders/us-east-1/default | default |                   1 | ConfigurationRecorder | config  | 
(2 rows)

resource_attribute

The relation resource_attribute is empty.

resource_relation

The relation resource_relation has four rows for each the configuration recorders. It looks like the intent is to describe the related IAM roles (relation="acts-as") and AWS accounts (relation="in").

It looks like there is some duplication here. Each recorder is "in" the same account twice. Each recorder "acts-as" two IAM roles, and those roles are in different accounts.

The provider_account_id is the same for all rows.

SELECT * FROM resource_relation WHERE resource_ID IN (6, 8);
 id  | resource_id | relation | target_id |                                                                                       raw                                                                                       | provider_account_id 
-----+-------------+----------+-----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------
  14 |           6 | in       |       139 | {"target": "arn:aws:organizations::012345678901:account/o-abcdefghij/012345678901", "relation": "in", "resource": "configurationRecorders/eu-west-1/default", "attributes": []} |                   1
  15 |           6 | acts-as  |       151 | {"target": "arn:aws:iam::012345678901:role/Config-Recorder", "relation": "acts-as", "resource": "configurationRecorders/eu-west-1/default", "attributes": []}                   |                   1
 494 |           6 | in       |       135 | {"target": "arn:aws:organizations::012345678901:account/o-abcdefghij/098765432109", "relation": "in", "resource": "configurationRecorders/eu-west-1/default", "attributes": []} |                   1
 495 |           6 | acts-as  |       365 | {"target": "arn:aws:iam::098765432109:role/Config-Recorder", "relation": "acts-as", "resource": "configurationRecorders/eu-west-1/default", "attributes": []}                   |                   1
  18 |           8 | in       |       139 | {"target": "arn:aws:organizations::012345678901:account/o-abcdefghij/012345678901", "relation": "in", "resource": "configurationRecorders/us-east-1/default", "attributes": []} |                   1
  19 |           8 | acts-as  |       151 | {"target": "arn:aws:iam::012345678901:role/Config-Recorder", "relation": "acts-as", "resource": "configurationRecorders/us-east-1/default", "attributes": []}                   |                   1
 590 |           8 | in       |       135 | {"target": "arn:aws:organizations::012345678901:account/o-abcdefghij/098765432109", "relation": "in", "resource": "configurationRecorders/us-east-1/default", "attributes": []} |                   1
 591 |           8 | acts-as  |       365 | {"target": "arn:aws:iam::098765432109:role/Config-Recorder", "relation": "acts-as", "resource": "configurationRecorders/us-east-1/default", "attributes": []}                   |                   1
(8 rows)

aws_config_configuration_recorder

The relation aws_config_configurationrecorder has two rows. I'm not sure how to interpret this data.

The _id column coincidentally has the same values as the resource relation.

The uri and provider_account_id is the same for both rows.

In fact the two rows are identical except for the values in _id, includeglobalresources, laststarttime, and laststatuschangetime.

SELECT * FROM aws_config_configurationrecorder;
 _id |                   uri                    | provider_account_id |                    rolearn                     | allsupported | includeglobalresourcetypes | resourcetypes |  name   |     laststarttime      | laststoptime | recording | laststatus | lasterrorcode | lasterrormessage |  laststatuschangetime  | _iam_role_id | _account_id 
-----+------------------------------------------+---------------------+------------------------------------------------+--------------+----------------------------+---------------+---------+------------------------+--------------+-----------+------------+---------------+------------------+------------------------+--------------+-------------
   6 | configurationRecorders/eu-west-1/default |                   1 | arn:aws:iam::012345678901:role/Config-Recorder | t            | t                          | []            | default | 2021-02-05 14:20:20+00 |              | t         | SUCCESS    |               |                  | 2021-03-31 08:20:31+00 |          151 |         139
   8 | configurationRecorders/us-east-1/default |                   1 | arn:aws:iam::012345678901:role/Config-Recorder | t            | f                          | []            | default | 2021-02-05 14:20:10+00 |              | t         | SUCCESS    |               |                  | 2021-03-31 09:20:22+00 |          151 |         139
(2 rows)

0074-aws_config_configurationrecorder.sql

The query outputs four rows. There are duplicates in all the columns you indicated.

The uri and the provider_account_id are the same for all rows.

The _id values 6 and 8 are repeated.

The output of the query in 0074-aws_config_configurationrecorder.sql looks like this:

 _id |                   uri                    | provider_account_id |                    rolearn                     | allsupported | includeglobalresourcetypes | resourcetypes |  name   |     laststarttime      | laststoptime | recording | laststatus | lasterrorcode | lasterrormessage |  laststatuschangetime  | _iam_role_id | _account_id 
-----+------------------------------------------+---------------------+------------------------------------------------+--------------+----------------------------+---------------+---------+------------------------+--------------+-----------+------------+---------------+------------------+------------------------+--------------+-------------
   6 | configurationRecorders/eu-west-1/default |                   1 | arn:aws:iam::098765432109:role/Config-Recorder | t            | t                          | []            | default | 2021-02-04 16:02:14+00 |              | t         | SUCCESS    |               |                  | 2021-03-31 05:02:25+00 |          151 |            
   6 | configurationRecorders/eu-west-1/default |                   1 | arn:aws:iam::098765432109:role/Config-Recorder | t            | t                          | []            | default | 2021-02-04 16:02:14+00 |              | t         | SUCCESS    |               |                  | 2021-03-31 05:02:25+00 |          365 |            
   8 | configurationRecorders/us-east-1/default |                   1 | arn:aws:iam::098765432109:role/Config-Recorder | t            | f                          | []            | default | 2021-02-04 16:02:17+00 |              | t         | SUCCESS    |               |                  | 2021-03-31 09:02:27+00 |          151 |            
   8 | configurationRecorders/us-east-1/default |                   1 | arn:aws:iam::098765432109:role/Config-Recorder | t            | f                          | []            | default | 2021-02-04 16:02:17+00 |              | t         | SUCCESS    |               |                  | 2021-03-31 09:02:27+00 |          365 |            
(4 rows)

@gsoltis
Copy link
Contributor

gsoltis commented Mar 31, 2021

Got it, I should be able to dig in on this. Thanks for the data.

@iainelder
Copy link
Author

Thanks. Let me know if you need anything else. The accounts are just demo accounts, so I can freeze the configuration for now.

@gsoltis
Copy link
Contributor

gsoltis commented Mar 31, 2021

Hmm, so the culprit appears to be that each config recorder has a link to two different IAM roles (_iam_role_id), which is incorrect. This means that there are two rows in resource_relation targeting an iam role for each config recorder.

So the next question is how did that happen? If these are demo accounts, would you be comfortable sharing a pg_dump over email?

If not, probably the next step is to look for the resources 151 and 365. If those are duplicates, then something is likely messed up with mapping the IAM roles properly.

SELECT
*
FROM 
  resource AS R
  INNER JOIN resource_raw AS Raw
    ON R.id = Raw.resource_id
WHERE
  R.id IN (151, 365)

I will also keep looking into a way to reproduce the issue.

@iainelder
Copy link
Author

If these are demo accounts, would you be comfortable sharing a pg_dump over email?

Sure, I'll try to get that to you before next week.

@iainelder
Copy link
Author

@gsoltis , I just emailed you the dump.

Let me know if you need anything else.

I created the dump like this.

pg_dump --host=localhost --dbname=introspector --user=postgres --file=introspector_dump.sql
tar --create --bzip2 --file=introspector_dump.sql.tar.bzip2 -- introspector_dump.sql
rm introspector_dump.sql

You can restore the dump like this.

# Reinitialize introspector
docker-compose down --volumes
docker-compose up --detach
introspector init

# Prepare database for import
dropdb --host=localhost --user=postgres introspector
createdb --host=localhost --user=postgres introspector

# Load dump into database
tar --extract --bzip2 --file=introspector_dump.sql.tar.bzip2
psql --host=localhost --user=postgres --dbname=introspector --file=introspector_dump.sql --quiet --output=/dev/null

@iainelder
Copy link
Author

the next step is to look for the resources 151 and 365. If those are duplicates, then something is likely messed up with mapping the IAM roles properly.

I'm guessing that resources 151 and 365 are those referenced in the _iam_role_id colum of the output of query 74.

What I'm seeing is that they are not duplicates, but something stranger.

Resource 365 is not an IAM role but a VPC route table.

So, yes, something has messed up somewhere :-)

I got the following two rows from the query in your last comment. Do you get the same from my dump?


Resource 151 is an IAM role.

-[ RECORD 1 ]-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id                  | 151
path                | o-abcdefghij$123456789012
uri                 | arn:aws:iam::123456789012:role/Config-Recorder
name                | Config-Recorder
provider_account_id | 1
provider_type       | Role
service             | iam
category            | Role
id                  | 151
source              | base
resource_id         | 151
provider_account_id | 1
raw                 | {"Arn": "arn:aws:iam::123456789012:role/Config-Recorder", "Path": "/", "Tags": [], "RoleId": "AROAW74HSJB4UE4DCLUUM", "RoleName": "Config-Recorder", "CreateDate": "2021-02-05T14:19:59+00:00", "PolicyList": [{"PolicyName": "Config-Recorder-Policy", "PolicyDocument": {"Version": "2012-10-17", "Statement": [{"Sid": "", "Action": ["s3:ListBucket", "s3:GetBucketAcl"], "Effect": "Allow", "Resource": "arn:aws:s3:::isme-baseline-audit-1"}, {"Sid": "", "Action": ["s3:PutObjectACl", "s3:PutObject"], "Effect": "Allow", "Resource": "arn:aws:s3:::isme-baseline-audit-1/config/AWSLogs/123456789012/*", "Condition": {"StringLike": {"s3:x-amz-acl": "bucket-owner-full-control"}}}, {"Sid": "", "Action": "sns:Publish", "Effect": "Allow", "Resource": ["arn:aws:sns:us-east-1:123456789012:ConfigChanges", "arn:aws:sns:eu-west-1:123456789012:ConfigChanges"]}]}}], "RoleLastUsed": {"Region": "eu-west-1", "LastUsedDate": "2021-04-04T10:27:11+00:00"}, "AttachedPolicies": [{"PolicyArn": "arn:aws:iam::aws:policy/service-role/AWS_ConfigRole", "PolicyName": "AWS_ConfigRole"}], "MaxSessionDuration": 3600, "AssumeRolePolicyDocument": {"Version": "2012-10-17", "Statement": [{"Sid": "", "Action": "sts:AssumeRole", "Effect": "Allow", "Principal": {"Service": "config.amazonaws.com"}}]}}

Resource 365 is a VPC route table.

-[ RECORD 2 ]-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id                  | 365
path                | o-abcdefghij$098765432109$eu-west-2
uri                 | arn:aws:ec2:eu-west-2:098765432109:route-table/rtb-346b7f5c
name                | rtb-346b7f5c
provider_account_id | 1
provider_type       | RouteTable
service             | ec2
category            | 
id                  | 368
source              | base
resource_id         | 365
provider_account_id | 1
raw                 | {"Tags": [], "VpcId": "vpc-a1f7afc9", "Routes": [{"State": "active", "Origin": "CreateRouteTable", "GatewayId": "local", "DestinationCidrBlock": "172.31.0.0/16"}, {"State": "active", "Origin": "CreateRoute", "GatewayId": "igw-c3985aab", "DestinationCidrBlock": "0.0.0.0/0"}], "OwnerId": "098765432109", "Associations": [{"Main": true, "RouteTableId": "rtb-346b7f5c", "AssociationState": {"State": "associated"}, "RouteTableAssociationId": "rtbassoc-2254c749"}], "RouteTableId": "rtb-346b7f5c", "PropagatingVgws": []}

@gsoltis
Copy link
Contributor

gsoltis commented Apr 5, 2021

I found the problem. Configuration recorders do not have an arn (see config resources). So, we made one up. Unfortunately, I forgot to include the account id in the made-up uri. Your import data has two recorders in different accounts, both with the same name, and so they conflicted, resulting in some assumptions being violated.

Most resources don't have this problem, since they supply their own URIs, but some do not. I've updated the uri function to include the account id, and verified that it no longer causes a crash. Will merge soon.

@gsoltis
Copy link
Contributor

gsoltis commented Apr 5, 2021

Thanks for providing the db dump, it made tracking down the issue much faster.

@iainelder
Copy link
Author

Configuration recorders do not have an arn

Maybe it's because there can only be one recorder per account per region, but still, how strange that there is no ARN!

Your import data has two recorders in different accounts, both with the same name

Incidentally, they were created using nozaq's terraform-aws-secure-baseline module. It will create a recorder in all regions with the same name. I ran the module in multiple accounts to test AWS Config's resource aggregation.

Thanks for providing the db dump, it made tracking down the issue much faster.

No problem, happy to help!

I've updated the uri function to include the account id, and verified that it no longer causes a crash. Will merge soon.

I'll try it over here when you tell me it's ready :-)

@gsoltis
Copy link
Contributor

gsoltis commented Apr 5, 2021

Can you try with image goldfig/introspector:v2.1.4 in your docker-compose file when you get a chance?

@iainelder
Copy link
Author

Hi, @gsoltis , The new v2.1.4 is working well.

Running with account 1:

Results - Import #1
  Start:  2021-04-12 20:55:00
  End:  2021-04-12 20:57:18
  Errors:  0
  Resources added:  275
  Resources removed:  0
  Resources updated:  3
    Attributes added:  30
    Attributes removed:  0
    Attributes updated:  0

Running with account 2:

Results - Import #2
  Start:  2021-04-12 20:57:54
  End:  2021-04-12 21:00:31
  Errors:  0
  Resources added:  148
  Resources removed:  0
  Resources updated:  13
    Attributes added:  18
    Attributes removed:  0
    Attributes updated:  12

(Sorry for the delay! Had to catch up with other things.)

@gsoltis
Copy link
Contributor

gsoltis commented Apr 12, 2021

Awesome! Glad to hear it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants