Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Job filters #247

Merged
merged 8 commits into from
Dec 10, 2018
Merged

Add Job filters #247

merged 8 commits into from
Dec 10, 2018

Conversation

aarontp
Copy link
Member

@aarontp aarontp commented Sep 7, 2018

Adds Job whitelist/blacklists. This is used when either starting the Turbinia Server, or per-request from the client.

@aarontp
Copy link
Member Author

aarontp commented Sep 7, 2018

I'm not sure I agree with codefactor's "complexity" rating here. The method seems fairly straightforward to me. I'm open to suggestions for how to handle this.

@@ -350,13 +350,15 @@ class TurbiniaServer(object):

Attributes:
task_manager (TaskManager): An object to manage turbinia tasks.
jobs_blacklist (list): Jobs we will exclude from running
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List of what? Strings? Class names? Instances? Please add this to the docstring.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

and all Jobs will be returned if both are specified.

Args:
jobs_blacklist (list): Jobs that will be excluded from running
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, please indicate the type of things in these lists

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

TomcatAnalysisJob()]

if jobs_whitelist and jobs_blacklist:
log.info(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just .info? I would have thought Warn, at least. I also would prefer raising here, rather than trying to continue - what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on raising and let the user know.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think raising is a better idea as well. Done.

@Onager Onager mentioned this pull request Sep 10, 2018
turbinia/jobs/__init__.py Outdated Show resolved Hide resolved
TomcatAnalysisJob()]

if jobs_whitelist and jobs_blacklist:
log.info(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on raising and let the user know.

turbinia/task_manager.py Outdated Show resolved Hide resolved
Copy link
Member Author

@aarontp aarontp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I addressed the comments and merged this in to work with the new Job manager. PTAL. Thanks!

@@ -350,13 +350,15 @@ class TurbiniaServer(object):

Attributes:
task_manager (TaskManager): An object to manage turbinia tasks.
jobs_blacklist (list): Jobs we will exclude from running
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

turbinia/jobs/__init__.py Outdated Show resolved Hide resolved
and all Jobs will be returned if both are specified.

Args:
jobs_blacklist (list): Jobs that will be excluded from running
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

TomcatAnalysisJob()]

if jobs_whitelist and jobs_blacklist:
log.info(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think raising is a better idea as well. Done.

turbinia/task_manager.py Outdated Show resolved Hide resolved
@aarontp
Copy link
Member Author

aarontp commented Nov 14, 2018

Sorry, I'll fix the conflicts first.

@aarontp
Copy link
Member Author

aarontp commented Nov 15, 2018

Ok, PTAL. Thanks!

Copy link
Contributor

@Onager Onager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just one thing I think should be different.


class JobsManager(object):
"""The jobs manager."""

_job_classes = {}

@classmethod
def FilterJobNames(
cls, jobs, jobs_blacklist=None, jobs_whitelist=None, objects=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a big fan of the "objects" arg here, and it doesn't seem to be used so I just removing it.

I recommend two different methods if you want this functionality in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I felt it was a bit skeevy too. It was actually used, but I refactored it into a separate function, and I think this is better.

PTAL

"""

def __init__(self):
def __init__(self, jobs_blacklist=None, jobs_whitelist=None):
"""Initialize Turbinia Server."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Initializes" + add args section

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, yeah, it was documented just above in the class attributes, but I realized that they are not actually class attributes, so I moved it here.

jobs_whitelist and jobs_blacklist must not be specified at the same time.

Args:
jobs (list[str]): The names of the jobs to filter.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to job_names for clarity

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

jobs_whitelist and jobs_blacklist must not be specified at the same time.

Args:
jobs (list[TurbiniaJob]): The names of the jobs to filter.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix the docstring here - this is the jobs themselves not their names.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"""Does setup of Task manager and its dependencies.

Args:
jobs_blacklist (list): Jobs that will be excluded from running
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Optional[list[str]), not (list)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Can you point me to a specification for this? Thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

self._backend_setup(*args, **kwargs)
# TODO(aarontp): Consider instantiating a job per evidence object
job_names = jobs_manager.JobsManager.GetJobNames()
if jobs_blacklist or jobs_whitelist:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a check here for specifying both? Since that will cause an exception later anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment below.

@@ -125,11 +137,22 @@ def add_evidence(self, evidence_):
log.info('Adding new evidence: {0:s}'.format(str(evidence_)))
self.evidence.append(evidence_)
job_count = 0
jobs_whitelist = evidence_.config.get('jobs_whitelist', [])
jobs_blacklist = evidence_.config.get('jobs_blacklist', [])
if jobs_blacklist or jobs_whitelist:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment about raising here instead of later

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually originally had a check that raised an exception here, but @berggren suggested removing it. Unless you feel strongly about this, I think I'll leave it as is since it will still raise an exception anyway. In practice this condition will be caught on start anyway (with another check that occurs during arg parse time).

Copy link
Member Author

@aarontp aarontp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL

"""

def __init__(self):
def __init__(self, jobs_blacklist=None, jobs_whitelist=None):
"""Initialize Turbinia Server."""
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, yeah, it was documented just above in the class attributes, but I realized that they are not actually class attributes, so I moved it here.

jobs_whitelist and jobs_blacklist must not be specified at the same time.

Args:
jobs (list[str]): The names of the jobs to filter.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

jobs_whitelist and jobs_blacklist must not be specified at the same time.

Args:
jobs (list[TurbiniaJob]): The names of the jobs to filter.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"""Does setup of Task manager and its dependencies.

Args:
jobs_blacklist (list): Jobs that will be excluded from running
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Can you point me to a specification for this? Thanks.

@@ -125,11 +137,22 @@ def add_evidence(self, evidence_):
log.info('Adding new evidence: {0:s}'.format(str(evidence_)))
self.evidence.append(evidence_)
job_count = 0
jobs_whitelist = evidence_.config.get('jobs_whitelist', [])
jobs_blacklist = evidence_.config.get('jobs_blacklist', [])
if jobs_blacklist or jobs_whitelist:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually originally had a check that raised an exception here, but @berggren suggested removing it. Unless you feel strongly about this, I think I'll leave it as is since it will still raise an exception anyway. In practice this condition will be caught on start anyway (with another check that occurs during arg parse time).

self._backend_setup(*args, **kwargs)
# TODO(aarontp): Consider instantiating a job per evidence object
job_names = jobs_manager.JobsManager.GetJobNames()
if jobs_blacklist or jobs_whitelist:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment below.

@aarontp aarontp requested a review from Onager December 7, 2018 19:24
@aarontp aarontp dismissed berggren’s stale review December 10, 2018 18:15

All review feedback should already be addressed

@aarontp
Copy link
Member Author

aarontp commented Dec 10, 2018

@berggren I ended up having to "dismiss" your review, though AFAIK all of the review feedback has been addressed. There was one comment I hadn't resolved, but it won't let me resolve it or comment on it now (I believe because it refers to a commit that no longer exists due to pushing a rebased-branch). If anyone knows of a better way to resolve reviews without dismissing them, please let me know.

@aarontp aarontp merged commit 4a1afaf into master Dec 10, 2018
@aarontp aarontp deleted the job-filter branch December 10, 2018 18:19
ericzinnikas pushed a commit to ericzinnikas/turbiniafb that referenced this pull request Jan 10, 2019
* Pin specific Celery/Kombu/Redis versions (google#305)

* in TurbiniaTaskResult, input_evidence doesn't have to be a list (google#309)

* input_evidence doesn't have to be a list

* Fix exception

* current pypi version of google-cloud-storage (1.13.0) requires google-cloud-core 0.28.1 (before the rename of google.cloud.iam (core) to google.api_core.iam (google#315)

* Use a link to a "parent Evidence" instead of subclassing (google#296)

* parent evidence

* undo some simplifications for the sake of a simpler CL

* Add some serialization

* update docstring

* Initialize attribute

* set parent evidence if Evidence type is context dependant

* don't pass parent_evidence at instantiation

* undo linter stuff

* comments

* fix aim lib breaking tests

* typo

* Print version on start 3 (google#320)

* Add files via upload

* Delete turbiniactl.py

* Delete turbiniactl.py

* Add files via upload

* Delete turbiniactl.py

* Add files via upload

* Update turbiniactl.py

* Caps

* Quick update to evidence docstrings (google#317)

... to disambiguate between _preprocess() and preprocess().

* Add Job filters (google#247)

* Add job filters

* fix docstrings.

* update docstring

* Get jobs filters working with new job manager

* Refactor out FilterJobObjects into new method

* Update YAPF

* remove missed confict markers

* Docstrings and renaming

* Migrate job graph generator to use new manager (google#321)

* Update Evidence local_path when it's saved (google#319)

* Pin google-cloud-storage to 1.13.0 (google#326)

Fixes google#325

Looks like google-cloud-storage was updated in:
googleapis/google-cloud-python#6741

Which just got released as 1.13.1:
https://pypi.org/project/google-cloud-storage/#history

* Set image export to process all partitions (google#324)

* Add --partitions all to image_export invocations

* Fix typo

* Explicitly set saved_paths to list (google#323)

* Move version print after log level setup (google#322)

* Move version print after log level setup

* Remove extra whitespace

* update the pystyle link (google#333)

* Undefined name: Define 'unicode' in Python 3 (google#337)

* Undefined name: Define 'unicode' in Python 3

__unicode()__ was removed in Python 3 because all __str__ are Unicode.

[flake8](http://flake8.pycqa.org) testing of https://github.com/google/turbinia on Python 3.7.1

$ __flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics__
```
./tools/turbinia_job_graph.py:47:40: F821 undefined name 'unicode'
  parser.add_argument('filename', type=unicode, help='where to save the file')
                                       ^
1     F821 undefined name 'unicode'
1
```

* Placate PyLint

* Added PSQ timeout to 1 week (google#336)

* Error when worker version doesn't match server google#307 (google#327)

* Added turbina_version to TurbinaTask

* First approach

* Changed to no rise error and return instead

* Restored the run from run_wrapper

* Changed format of strings

* Changed words fixed line too long

* bump version
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants