pepnovo: add aarch64/arm64 build #51206

Merged
merged 3 commits into from
Oct 8, 2024

Conversation

martin-g
Contributor

@martin-g martin-g commented Oct 7, 2024

Describe your pull request here


Please read the guidelines for Bioconda recipes before opening a pull request (PR).

General instructions

  • If this PR adds or updates a recipe, use "Add" or "Update" appropriately as the first word in its title.
  • New recipes not directly relevant to the biological sciences need to be submitted to the conda-forge channel instead of Bioconda.
  • PRs require reviews prior to being merged. Once your PR is passing tests and ready to be merged, please issue the @BiocondaBot please add label command.
  • Please post questions on Gitter or ping @bioconda/core in a comment.

Instructions for avoiding API, ABI, and CLI breakage issues

Conda is able to record and lock (a.k.a. pin) dependency versions used at build time of other recipes.
This keeps later changes to a recipe from breaking the API, ABI, or CLI expectations of downstream recipes.
If not already present in the meta.yaml, make sure to specify run_exports (see here for the rationale and comprehensive explanation).
Add a run_exports section like this:

build:
  run_exports:
    - ...

with ... being one of:

| Case | run_exports statement |
| --- | --- |
| semantic versioning | {{ pin_subpackage("myrecipe", max_pin="x") }} |
| semantic versioning (0.x.x) | {{ pin_subpackage("myrecipe", max_pin="x.x") }} |
| known breakage in minor versions | {{ pin_subpackage("myrecipe", max_pin="x.x") }} (in such a case, please add a note that briefly mentions your evidence for that) |
| known breakage in patch versions | {{ pin_subpackage("myrecipe", max_pin="x.x.x") }} (in such a case, please add a note that briefly mentions your evidence for that) |
| calendar versioning | {{ pin_subpackage("myrecipe", max_pin=None) }} |

while replacing "myrecipe" with either name if a name|lower variable is defined in your recipe or with the lowercase name of the package in quotes.

Bot commands for PR management

Please use the following BiocondaBot commands:

Everyone has access to the following BiocondaBot commands, which can be given in a comment:

  • @BiocondaBot please update: Merge the master branch into a PR.
  • @BiocondaBot please add label: Add the please review & merge label.
  • @BiocondaBot please fetch artifacts: Post links to CI-built packages/containers. You can use this to test packages locally.

Note that the @BiocondaBot please merge command is now deprecated. Please just squash and merge instead.

Also, the bot watches for comments from non-members that include @bioconda/<team> and will automatically re-post them to notify the addressed <team>.

@martin-g martin-g added the aarch64 (Related to adding linux-aarch64 support) and osx-arm64 (Related to adding osx-arm64 support) labels Oct 7, 2024
Contributor

coderabbitai bot commented Oct 7, 2024

Warning

Rate limit exceeded

@BiocondaBot has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 15 minutes and 17 seconds before requesting another review.

📥 Commits

Files that changed from the base of the PR and between 2d8da7e and 6778f8c.

Walkthrough

This pull request introduces significant updates to the build-fail-blacklist and the pepnovo package. The build-fail-blacklist has been revised to remove several recipes identified as problematic due to various issues like excessive resource consumption, compatibility problems, and specific errors during compilation. Some recipes are flagged for migration to conda-forge, while others have open pull requests related to their migration. The pepnovo package has undergone changes in its build.sh script, where the set -xe command has been added for improved error handling, and the make command now utilizes the -j ${CPU_COUNT} option for parallel compilation. Furthermore, the meta.yaml file for pepnovo has been updated to reflect a new build number, the addition of run exports for dependency management, and expanded platform support, now including linux-aarch64 and osx-arm64. These changes collectively enhance the build process and package compatibility.

Possibly related PRs

Suggested labels

please review & merge



Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)
recipes/pepnovo/build.sh (1)

9-9: Great optimization for parallel compilation!

The addition of -j ${CPU_COUNT} to the make command enables parallel compilation, which can significantly speed up the build process on multi-core systems. This is an excellent optimization.

Consider adding a fallback value for CPU_COUNT in case it's not set:

-make -j ${CPU_COUNT} CC="${CXX}" CFLAGS="${CXXFLAGS} -Wno-narrowing " LDFLAGS="${LDFLAGS}"
+make -j ${CPU_COUNT:-1} CC="${CXX}" CFLAGS="${CXXFLAGS} -Wno-narrowing " LDFLAGS="${LDFLAGS}"

This ensures that if CPU_COUNT is not set, the build will still proceed with a single job.
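
The `${CPU_COUNT:-1}` form is standard shell parameter expansion: it substitutes `1` when the variable is unset or empty. A minimal sketch of that behavior, separate from the recipe itself (the `jobs_for` helper is hypothetical and only echoes the value that `make -j` would receive):

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the job count that `make -j` would receive.
jobs_for() {
    # ${CPU_COUNT:-1} expands to 1 when CPU_COUNT is unset or empty.
    echo "${CPU_COUNT:-1}"
}

unset CPU_COUNT
echo "unset: $(jobs_for)"

CPU_COUNT=""
echo "empty: $(jobs_for)"

CPU_COUNT=4
echo "set:   $(jobs_for)"
```

The first two calls fall back to a single job; the third passes the configured value through unchanged.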

recipes/pepnovo/meta.yaml (1)

13-14: Consider a more conservative pinning strategy.

The addition of run_exports is a good practice for managing dependencies. However, the current configuration (max_pin=None) allows any future version of pepnovo to satisfy the dependency requirement. This might be too permissive.

Consider using a more conservative pinning strategy, such as max_pin='x.x' or max_pin='x', depending on your versioning scheme and how often you expect to make breaking changes. For example:

run_exports:
  - {{ pin_subpackage('pepnovo', max_pin='x.x') }}

This would restrict automatically accepted updates to the same major.minor series, reducing the risk of unexpected breaking changes.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 1bc7f1b and 608e2dc.

📒 Files selected for processing (3)
  • build-fail-blacklist (0 hunks)
  • recipes/pepnovo/build.sh (1 hunks)
  • recipes/pepnovo/meta.yaml (2 hunks)
💤 Files with no reviewable changes (1)
  • build-fail-blacklist
🧰 Additional context used
🔇 Additional comments (3)
recipes/pepnovo/build.sh (1)

3-3: Excellent addition for improved error handling and debugging!

The set -xe command is a great addition to the build script. It enables immediate exit on any command failure (-e) and prints each command before execution (-x). This will make debugging easier and ensure that the build process stops if any step fails.
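
The effect of `-e` can be seen with a small sketch that runs the same two-step snippet with and without it. Here `false` stands in for a failing build step, and the `run_steps` helper is hypothetical; `-x` is left out to keep the output short, since it only adds a command trace on stderr.

```shell
#!/usr/bin/env bash
# Run a two-step snippet in a child shell and capture its stdout.
# $1 is an optional prologue such as "set -e".
run_steps() {
    bash -c "$1
false
echo reached" 2>/dev/null
}

# Without -e the failing step is ignored and the next step still runs.
without_e=$(run_steps "")

# With -e the child shell exits at `false`, so nothing is printed.
with_e=$(run_steps "set -e") || true

echo "without set -e: '${without_e}'"
echo "with set -e:    '${with_e}'"
```

In a build script this difference decides whether a failed `make` aborts the build or is silently followed by the remaining steps.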

recipes/pepnovo/meta.yaml (2)

12-12: LGTM: Build number increment is appropriate.

The build number has been correctly incremented from 3 to 4. This is in line with best practices for package management, especially when making changes that don't affect the package version itself.


33-35: LGTM: New platform support added as intended.

The addition of 'linux-aarch64' and 'osx-arm64' to the 'additional-platforms' list is in line with the PR's main objective of adding aarch64/arm64 build support. This change will enable the package to be built and used on ARM-based systems, increasing its compatibility and reach.

To ensure full compatibility, please run the following script to check for any architecture-specific code or dependencies that might need adjustment:

Please review the output of this script to ensure that there are no unexpected architecture-specific elements that might cause issues on the new platforms.

Signed-off-by: Martin Tzvetanov Grigorov <mgrigorov@apache.org>
Trying to fix:
```
base64.cpp:124:11: error: ISO C++17 does not allow 'register' storage class specifier [-Wregister]
2024-10-08T08:41:28.5270550Z 08:41:27 BIOCONDA INFO (ERR)       int register a;
```
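
The commit message does not say how the error was addressed. One common option for legacy sources that still use `register` is to strip the keyword before compiling; another is to relax the diagnostic named in the log with `-Wno-register`. The sketch below illustrates only the first option, on a throwaway stand-in file rather than the real `base64.cpp`; the file name and contents are assumptions.

```shell
#!/usr/bin/env bash
# Illustration only: a temporary file stands in for the real source.
tmpdir=$(mktemp -d)
src="${tmpdir}/example.cpp"
printf 'int f() {\n    int register a;\n    a = 1;\n    return a;\n}\n' > "${src}"

# Drop `register` wherever it appears as a separate word.
# -i.bak works with both GNU and BSD sed and leaves a backup copy.
sed -i.bak -E 's/(^|[^[:alnum:]_])register[[:space:]]+/\1/g' "${src}"

# Count the lines that still mention the keyword, then show the fixed line.
remaining=$(grep -c 'register' "${src}" || true)
fixed_line=$(sed -n '2p' "${src}")

echo "lines still using register: ${remaining}"
echo "declaration now reads: ${fixed_line}"
rm -rf "${tmpdir}"
```

Editing the source keeps the default language standard, while a compiler flag leaves the upstream code untouched; which trade-off this PR chose is not shown in the conversation.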
@martin-g
Contributor Author

martin-g commented Oct 8, 2024

@BiocondaBot please fetch artifacts

@BiocondaBot
Collaborator

Package(s) built are ready for inspection:

| Arch | Package | Zip File / Repodata | CI | Instructions |
| --- | --- | --- | --- | --- |
| linux-64 | pepnovo-20101117-h4ac6f70_4.tar.bz2 | LinuxArtifacts.zip | Azure | You may also use conda to install after downloading and extracting the zip file. From the LinuxArtifacts directory: conda install -c ./packages <package name> |
| osx-64 | pepnovo-20101117-hac4f329_4.tar.bz2 | OSXArtifacts.zip | Azure | You may also use conda to install after downloading and extracting the zip file. From the OSXArtifacts directory: conda install -c ./packages <package name> |
| osx-arm64 | pepnovo-20101117-h6aa7127_4.tar.bz2 | repodata.json | CircleCI | You may also use conda to install: conda install -c https://output.circle-artifacts.com/output/job/da57b025-d8b8-4fe9-8457-e026ab9c6dfe/artifacts/0/tmp/artifacts/packages <package name> |
| linux-aarch64 | pepnovo-20101117-h78569d1_4.tar.bz2 | repodata.json | CircleCI | You may also use conda to install: conda install -c https://output.circle-artifacts.com/output/job/d8f642e2-81f4-46fe-a567-bc0cf2318f11/artifacts/0/tmp/artifacts/packages <package name> |

Docker image(s) built:

| Package | Tag | CI |
| --- | --- | --- |
| pepnovo | 20101117--h4ac6f70_4 | Azure |

Install with docker (images for Azure are in the LinuxArtifacts zip file above): gzip -dc LinuxArtifacts/images/pepnovo:20101117--h4ac6f70_4.tar.gz | docker load

@martin-g
Contributor Author

martin-g commented Oct 8, 2024

mgrigorov in 🌐 euler-arm-22 in /tmp/pepnovo took 7s 
❯ tar xvf pepnovo-20101117-h78569d1_4.tar.bz2 
info/files
info/run_exports.json
info/test/run_test.sh
info/recipe/.gitattributes
info/hash_input.json
info/paths.json
info/index.json
info/recipe/build.sh
info/recipe/0001-fix-type-error-in-ReScoreDB.cpp.patch
info/recipe/meta.yaml.template
info/licenses/LICENSE
info/recipe/0002-fix-float-positive-infinity.patch
info/recipe/meta.yaml
info/recipe/conda_build_config.yaml
info/about.json
info/git
bin/pepnovo

mgrigorov in 🌐 euler-arm-22 in /tmp/pepnovo 
❯ file bin/*
bin/pepnovo: ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, for GNU/Linux 3.7.0, not stripped

mgrigorov in 🌐 euler-arm-22 in /tmp/pepnovo 
❯ ./bin/pepnovo
***************************************************************************

Error: Missing model name!

PepNovo+ - de Novo peptide sequencing and
MS-Filter - spectal quality scoring, precursor mass correction and chage determination.
Release 20101117.
All rights reserved to the Regents of the University of California.

Required arguments:
-------------------

-model <model name>

-file <path to input file>  - PepNovo can analyze dta,mgf and mzXML files
   OR
-list <path to text file listing input files>


Optional PepNovo arguments: 
----------------------------- 
-prm		- only print spectrum graph nodes with scores.
-prm_norm   - prints spectrum graph scores after normalization and removal of negative scores.
-correct_pm - finds optimal precursor mass and charge values.
-use_spectrum_charge - does not correct charge.
-use_spectrum_mz     - does not correct the precursor m/z value that appears in the file.
-no_quality_filter   - does not remove low quality spectra.
-output_aa_probs	 - calculates the probabilities of individual amino acids.
-output_cum_probs    - calculates the cumulative probabilities (that at least one sequence upto rank X is correct).
-fragment_tolerance < 0-0.75 > - the fragment tolerance (each model has a default setting)
-pm_tolerance       < 0-5.0 > - the precursor masss tolerance (each model has a default setting)
-PTMs   <PTM string>    - seprated  by a colons (no spaces) e.g., M+16:S+80:N+1
-digest <NON_SPECIFIC,TRYPSIN> - default TRYPSIN
-num_solutions < 1-2000 > - default 20
-tag_length < 3-6> - returns peptide sequence of the specified length (only lengths 3-6 are allowed).
-model_dir  < path > - directory where model files are kept (default ./Models)

-max_pm	         <X> - X is the maximal precursor mass to be considered (good for shorty searhces).

Optional MS-Filter arguments:
-----------------------------
-min_filter_prob <xx=0-1.0> - filter out spectra from denovo/tag/prm run with a quality probability less than x (e.g., x=0.1)
-pmcsqs_only   - only output the corrected precursor mass, charge and filtering values
-filter_spectra <sqs thresh> <out dir>  - outputs MGF files for spectra that have a minimal qulaity score above *thresh* (it is recomended to use a value of 0.05-0.1). These MGF files will be sent to the directory given in out_dir and have a name with the prefix given in the third argument.
 NOTE: this option must be used in conjuction with  "-pmcsqs_only" the latter option will also correct the m/z value and assign a charge to the spectrum.

-pmcsqs_and_prm <min prob> - print spectrum graph nodes for spectra that have an SQS probability score of at least <min prob> (typically should have a value 0-0.2)


Predicting fragmentation patterns:
----------------------------------
-predict_fragmentation <X> - X is the input file with a list of peptides and charges (one per line)
-num_peaks             <N> - N is the maximal number of fragment peaks to predict


Parameters for Blast:
---------------------
-msb_generate_query   - performs denovo sequencing and generates a BLAST query.
-msb_merge_queries    -	takes a list of PepNovo "_full.txt" files, merges them and creates queries(list should be given with -list flag).
-msb_query_name       <X> - the name to be given to the main output file.
-msb_query_size       <X> - max size of MSB query allowed (default X=1000000).
-msb_num_solutions    <X> - number of sequences to generate per spectrum (default X=7).
-msb_min_score        <X> - the minimal MS-Blast score to be included in the query, default X=4.0 .

Citations:
----------
- Frank, A. and Pevzner, P. "PepNovo: De Novo Peptide Sequencing via Probabilistic Network Modeling", Analytical Chemistry 77:964-973, 2005.
- Frank, A., Tanner, S., Bafna, V. and Pevzner, P. "Peptide sequence tags for fast database search in mass-spectrometry", J. Proteome Res. 2005 Jul-Aug;4(4):1287-95.
- Frank, A.M., Savitski, M.M., Nielsen, L.M., Zubarev, R.A., Pevzner, P.A. "De Novo Peptide Sequencing and Identification with Precision Mass Spectrometry", J. Proteome Res. 6:114-123, 2007.
- Frank, A.M. "A Ranking-Based Scoring Function for Peptide-Spectrum Matches", J.Proteome Res. 8:2241-2252, 2009.

Please send comments and bug reports to Ari Frank (arf@cs.ucsd.edu).

LGTM!
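
The manual architecture check in this comment could also be scripted. The following is a sketch only: the `is_aarch64_elf` helper is hypothetical and inspects nothing but the description text it is handed, which here is copied from the `file` output in the session.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a `file` description names an
# aarch64 ELF binary. It only looks at the text passed as $1.
is_aarch64_elf() {
    case "$1" in
        *"ELF 64-bit"*"ARM aarch64"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Description taken from the session; in practice it would come
# from "$(file bin/pepnovo)" after extracting the package.
desc='bin/pepnovo: ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), dynamically linked'

if is_aarch64_elf "${desc}"; then
    echo "aarch64 binary"
else
    echo "unexpected architecture"
fi
```

A check like this confirms only the target architecture; running the binary, as done in the session, is still what shows that it loads and starts.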

@martin-g
Contributor Author

martin-g commented Oct 8, 2024

@BiocondaBot please add label

Labels
aarch64 (Related to adding linux-aarch64 support), osx-arm64 (Related to adding osx-arm64 support), please review & merge (set to ask for merge)
3 participants