Skip to content

build a simple google translate example around simple_report.py (NO_J… #70

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 13, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
256 changes: 256 additions & 0 deletions notebooks/API_Training_Exercises/01_Simple_Report.ipynb
Copy link
Contributor

@Alex-AMC Alex-AMC Nov 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please remove the output of all the cells ( I think the last one in particular is a bit too big) Cell 3 less so.

A suggestion and not necessary - I can see that you've added the dependencies as googletrans, might be nice to have a cell that installs it using magic commands

!conda install googletrans 
or 
!pip install googletrans 

That way, the user doesn't have to leave the notebook :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did try that but for me it proved somewhat problematic (basically just failed)

Original file line number Diff line number Diff line change
@@ -0,0 +1,256 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```\n",
"This script can be used for any purpose without limitation subject to the\n",
"conditions at http://www.ccdc.cam.ac.uk/Community/Pages/Licences/v2.aspx\n",
"\n",
"This permission notice and the following statement of attribution must be\n",
"included in all copies or substantial portions of this script.\n",
"\n",
"2024-09-11: Made available by the Cambridge Crystallographic Data Centre.\n",
"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# International Report\n",
"\n",
"This notebook talks through how to create a report that reports a CSD entry ... but uses google translate to translate headings into the language of user choice.\n",
"You could use the ideas her to convert the 'simple_report.py script into an international report\n",
"\n",
"#### Prerequisites\n",
"\n",
"First install the CSD Python API and googletrans into your conda or pip environment. This should only be needed the before you run the script."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we are ready to start some python coding. We need to import some modules. We are writing a script that ultimately will be used in Mercury so we will want some of \n",
"the special utilities available for writing reports."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from ccdc.utilities import html_table\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can create a translator for the script to use, and define our language of choice. Lets write a little function to do the translation"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from googletrans import Translator\n",
"\n",
"translator = Translator()\n",
"def tr(text, lang):\n",
" if lang is None:\n",
" return text\n",
" try:\n",
" return translator.translate(str(text),src=\"en\",dest=lang).text\n",
" except Exception as e:\n",
" return text"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's try it out:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n",
"tr(\"Mary had a little lamb\", \"ar\") # Lets try arabic!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So now we can create a report on a CSD entry that is international. The following function will work in a Mercury API script. Note how it defines an interface object"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def get_coordinates(molecule, round_digits=None):\n",
" \"\"\"Yield the label and fractional coordinates of all atoms in the molecule.\n",
"\n",
" :param molecule: (:obj:`ccdc.molecule.Molecule`) The molecule for which to return coordinates.\n",
" :param round_digits: (:obj:`int`) How many decimal digits coordinates should be rounded to.\n",
" :returns: (:obj:`list`) List of the label and fractional x/y/z coordinates for each atom\n",
" in the molecule in format ['Atom label', 'X coordinate', 'Y coordinate', 'Z coordinate'].\n",
" \"\"\"\n",
" for atom in molecule.atoms:\n",
" try:\n",
" x, y, z = atom.fractional_coordinates\n",
" yield [atom.label,\n",
" x if round_digits is None else round(x, round_digits),\n",
" y if round_digits is None else round(y, round_digits),\n",
" z if round_digits is None else round(z, round_digits)]\n",
" except TypeError:\n",
" continue\n",
"\n",
"\n",
"def main(interface=None, lang=None):\n",
" \"\"\"Generate a simple report based on the entry currently selected in the Mercury UI.\n",
"\n",
" :param interface: (:obj:`ccdc.utility.ApplicationInterface`) An ApplicationInterface instance.\n",
" \"\"\"\n",
" if interface is None:\n",
" from ccdc.utilities import ApplicationInterface\n",
" interface = ApplicationInterface()\n",
"\n",
" entry = interface.current_entry\n",
"\n",
" # Open a HTML report. This will create the file, copy the CSD Python API\n",
" # default template for reports and fill in the headline/page title.\n",
" with interface.html_report(title=tr('Simple report on ',lang)+ ' ' + entry.identifier) as report:\n",
"\n",
" # Write the section header for Entry Details\n",
" report.write_section_header(tr('Entry Details',lang))\n",
" # Assemble a list of information and labels to go into the \"Entry Details\" table\n",
" entry_details = [\n",
" ['<b>'+tr('Chemical name',lang)+'</b>', entry.chemical_name],\n",
" ['<b>'+tr('Synonyms',lang)+'</b>', entry.synonyms],\n",
" ['<b>'+tr('Formula',lang)+'</b>', entry.formula],\n",
" ['<b>'+tr('R-factor',lang)+'</b>', entry.r_factor],\n",
" ['<b>'+tr('Disorder',lang)+'</b>', tr(entry.disorder_details,lang)],\n",
" ['<b>'+tr('Polymorphism',lang)+'</b>', tr(entry.polymorph,lang)],\n",
" ['<b>'+tr('3D structure',lang)+'</b>', tr(entry.has_3d_structure,lang)],\n",
" ['<b>'+tr('Organic',lang)+'</b>', tr(entry.is_organic,lang)],\n",
" ['<b>'+tr('Polymeric',lang)+'</b>', tr(entry.is_polymeric,lang)],\n",
" ['<b>'+tr('Bioactivity',lang)+'</b>', tr(entry.bioactivity,lang)],\n",
" ['<b>'+tr('Source',lang)+'</b>', tr(entry.source,lang)],\n",
" ['<b>'+tr('Habit',lang)+'</b>', tr(entry.habit,lang)],\n",
" ]\n",
" # Generate a HTML table from the entry details and write it to the report\n",
" report.write(html_table(data=entry_details, table_id='entry_details'))\n",
"\n",
" # Write the section header for Fractional Coordinates\n",
" report.write_section_header(tr('Fractional Coordinates',lang))\n",
" # Get the coordinates of all the atoms from the entry and write them to a HTML table\n",
" report.write(html_table(data=list(get_coordinates(entry.molecule, round_digits=3)),\n",
" table_id='fractional_coordinates',\n",
" header=[tr('Atom',lang), 'x', 'y', 'z']))\n",
"\n",
" # Write the section header for Publication Details\n",
" report.write_section_header(tr('Publication Details',lang))\n",
" # Assemble a list of information and labels for the Publication Details table\n",
" publication_details = [\n",
" ['<b>'+tr('Reference',lang)+'</b>', '%s Volume %s, %s' % (getattr(entry.publication, 'journal_name', ''),\n",
" entry.publication.volume,\n",
" entry.publication.year)],\n",
" ['<b>'+tr('Authors', lang)+'</b>', entry.publication.authors],\n",
" ['<b>'+tr('Document Object Identifier',lang)+'</b>', entry.publication.doi]\n",
" ]\n",
" # Write the publication details to a HTML table\n",
" report.write(html_table(publication_details, table_id='publication_details'))\n",
"\n",
" # Write the section header for Basic Crystallographic Information\n",
" report.write_section_header(tr('Basic Crystallographic Information',lang))\n",
" # Assemble a list of basic crystal information and labels for the table\n",
" crystallographic_data = [\n",
" ['<b>'+tr('Crystal System',lang)+'</b>', entry.crystal.crystal_system],\n",
" ['<b>'+tr('Space Group',lang) + '</b>', entry.crystal.spacegroup_symbol],\n",
" ['<b>'+tr('Cell Volume',lang)+ '</b>', '%s ų' % round(entry.crystal.cell_volume, 3)],\n",
" ['<b>Z, Z\\'</b>', entry.crystal.z_prime],\n",
" ]\n",
" # Write the crystallographic details to a HTML table\n",
" report.write(html_table(crystallographic_data, table_id='crystallographic_information'))\n",
"\n",
" # Once the HTMLReport is closed (e.g. when the with: branch above ends),\n",
" # it will automatically write the appropriate HTML footer."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Lets use our function. As we are in a notebook, we will have to define a dummy interface file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"from ccdc.utilities import ApplicationInterface\n",
"interface = ApplicationInterface(parse_commandline=False)\n",
"interface.identifier = \"AABHTZ\"\n",
"interface.output_html_file = f'{interface.identifier}_{time.strftime(\"%H%M%S\", time.gmtime())}_report.html'\n",
"\n",
"main(interface,\"ar\") # Arabic "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally use Jupyter to display it:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import HTML\n",
"HTML(interface.output_html_file)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.20"
}
},
"nbformat": 4,
"nbformat_minor": 4
}