Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-28009: Fix uuid.uuid1() and uuid.get_node() on AIX #8672

Merged
merged 43 commits into from
Sep 26, 2019

Conversation

aixtools
Copy link
Contributor

@aixtools aixtools commented Aug 4, 2018

FYI: linux et al use b':' to separate the 6 fields of a MAC address, also the fields are always two characters. On AIX the character b'.' is used, and when the decimal value is < 16, only one character is used.

Was: b'00:0a:10:20:30:40', on AIX - could be: b'0.a.10.20.30.40'

p.s. I hope this can be considered for backporting.

https://bugs.python.org/issue28009

correct the field seperator character of ethernet adapter MAC address on AIX
Lib/uuid.py Outdated
if word.count(_mac_delim) == 5:
if len(word) == 17:
mac = int(word.replace(_mac_delim, b''), 16)
elif len(word) < 17 and len(word) >= 11:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this change. Is it only needed on AIX? I would prefer to keep ==17 test on other platforms.

Copy link
Contributor Author

@aixtools aixtools Aug 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I do not have access to other platforms I was trying to write something that was generic.
Roughly - first test for the _MAC_DELIM count.
Then - if the overall length is 17 the characters will be the correct hex value - when evaluated, so the splitting into individual hex values and adding and shifting is not needed.
Else: each individual hex value must be added in separately AFTER an 8/bit left shift of the previous value.

Otherwise a macaddr from:
en0 1500 link#2 fa.d1.8c.f7.62.4 1601514231 0 664547436 0 0

returns as:
fad18cf7624 (fad 18cf 7624, dec ‭17236120008228‬) rather than as
fad18cf76204 (fad1 8cf7 6204 , dec 275777920131588)

If you still want it changed, please comment again after push in about 5 minutes)

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@aixtools aixtools changed the title bpo-28009: correct character used to seperate ethernet MAC address for AIX bpo-28009: Fix uuid.uuid1() and uuid.get_node() on AIX Aug 22, 2018
@aixtools
Copy link
Contributor Author

pushed a small change because there was a "phase failure" in one of the tests (agent did not start).

Copy link
Contributor Author

@aixtools aixtools left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes done, or comments added

…e getnode()

Move _netstat_getnode() find alg so that it can be tested using unittest.mock
Add new test `test_uuid/find_mac_netstat` for AIX and netstat
@aixtools
Copy link
Contributor Author

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@vstinner: please review the changes made to this pull request.

command='netstat',
args='-ia',
hw_identifiers=b'Address',
get_index=lambda x: x * 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why you need to use i+1 here. It looks like a bug.

Copy link
Contributor Author

@aixtools aixtools Aug 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess my replies did not make it here.
a) It is "* 1" not plus 1
b) it may indicate a bug (not in the test I fear, but elsewhere in core) - when I was trying to use the find_mac code to parse netstat output I first tried "get_index=lambda x: x", but it always evaluated as get_index=lambda x: x + 1. I may have also tried x + 0, but I do not recall. Before merging I can experiment with other values - and if they are not working that is a different bug - not the one I am trying to resolve.

Copy link
Contributor

@taleinat taleinat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With my recent changes, this LGTM.

I'd like to get approval from another core dev, though. Or perhaps @zestyping (the original author of the uuid module)?

Copy link
Contributor

@zestyping zestyping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Here are my two cents. I realize my suggestions involve substantial changes and I don't have prior context on the evolution of this PR, so feel free to take with a grain of salt. I hope they are somewhat helpful, though. I don't have authority over this file, merely opinions.

As a general note, it seems to me that it's more robust to write code that works for all cases (e.g. split on both colons and periods; always tolerate hex bytes that have one or two characters) rather than different behaviours for each platform, which might be more likely to break if the platform detection is incorrect or if the format changes slightly on a given platform.

Lib/uuid.py Outdated
@@ -384,48 +394,103 @@ def _popen(command, *args):
def _is_universal(mac):
return not (mac & (1 << 41))

def _find_mac(command, args, hw_identifiers, get_index):

# In the next two fucnctions:
Copy link
Contributor

@zestyping zestyping Jul 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand the general approach here, and it makes sense. These two functions need a better explanation of what they do, though—the comments here aren't enough for a reader to understand how they're intended to work, and the argument names hw_identifiers and f_index are quite cryptic.

Here's an attempt at some better names and a docstring:

def _find_mac_near_keyword(command, args, keywords, get_word_index):
    """Searches the output of a command for a MAC address near a keyword.

    Each line of words in the output is lowercased and searched for any of
    the given keywords.  Upon a match, get_word_index is invoked to pick a
    word from the line, given the index of the match (e.g. lambda i: 0 gets the
    first word on the line; lambda i: i - 1 gets the word preceding the keyword)."""

Given that _find_mac_nextlines is supposed to look for words under column headings, there's no need for it to take an f_index function, and the third argument can be named more clearly. Indeed, it would be misleading to call the third argument hw_identifiers, as it's a single string, not a list of strings. Here's a try:

def _find_mac_under_heading(command, args, heading):
    """Looks for a MAC address under a heading in the output of a command.

    The first line of words in the output is searched for the given heading.
    Words at the same word index as the heading in subsequent lines are then
    examined to see if they look like MAC addresses."""

Another concern I have is that the logic about what looks like a MAC address is mixed in with the logic about where to find it. It would be ideal to factor it out, like this:

def _find_near_keyword(command, args, keywords, get_word_index):
    """Yields selected nearby words for keywords found in the output of a command.
    ...
        yield _parse_mac(word)
    ...

def _find_mac_under_heading(command, args, heading):
    """Yields words found under a heading in the output of a command.

    The first line of words in the output is searched for the given heading.
    For each subsequent line, the word at the same word index as the heading
    in subsequent lines is then yielded."""
    
def _parse_mac(word):
    """Converts a string to a 48-bit integer if it looks like a MAC address, or returns None."""
    # A MAC address is six bytes, each printed as one or two hex digits.
    # They may be separated by ':' or '.' depending on the platform.
    parts = word.replace('.', ':').split(':')
    if len(parts) == 6 and all(len(part) <= 2 for part in parts):
        hexstr = b''.join(part.rjust(2, b'0') for part in parts)
        try:
            return int(hexstr, 16)
        except ValueError:
            pass

def _select_best_mac(macs):
    """Chooses a MAC from the given iterable, preferring any universally administered
    MAC address over locally administered MAC addresses."""
    local_macs = []
    for mac in macs:
        if _is_universal(mac):
            return mac
        elif mac:
            local_macs.append(mac)
    if local_macs:
        return local_macs[0]

Then:

_select_best_mac(_find_near_keyword('ifconfig', args, keywords, lambda i: i + 1))
_select_best_mac(_find_under_heading('netstat', '-ia', b'Address')

etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.
Very interesting to read up better ways to do things, e.g., "parts = word.replace('.', ':').split(':')"

@taleinat
Copy link
Contributor

Thanks for the review, @zestyping! For now, I've implemented those of your suggestions which don't entail refactoring beyond the scope of adding support for AIX.

I've also implemented @ncoghlan's suggestion of adding a _MAC_OMITS_LEADING_ZEROES platform-dependent constant, to avoid direct OS checks when parsing MAC addresses.

At this point I've reviewed all existing comments on this PR and implemented all of those which seemed in scope. Assuming the tests pass, this is good to go IMO. @ncoghlan, @vstinner, what do you think?

@taleinat
Copy link
Contributor

Also, should this be cherry-picked into 3.8?

@aixtools
Copy link
Contributor Author

Also - @taleinat - many thanks for your assistance!

@zestyping
Copy link
Contributor

Thanks, @taleinat!

For now, I've implemented those of your suggestions which don't entail refactoring beyond the scope of adding support for AIX.

That's a very reasonable constraint.

Can you help me understand your thinking behind keeping _MAC_DELIM and _MAC_OMITS_LEADING_ZEROES around, instead of using logic that always accommodates both delimiters and accommodates bytes with one or two hex digits?

@taleinat
Copy link
Contributor

Can you help me understand your thinking behind keeping _MAC_DELIM and _MAC_OMITS_LEADING_ZEROES around, instead of using logic that always accommodates both delimiters and accommodates bytes with one or two hex digits?

I kept them due to @vstinner's request to have this change only affect AIX. Your suggestion could potentially have side-effects on all platforms.

Also, the Windows-specific _ipconfig_getnode() actually implements similar logic, but uses yet another delimiter, '-', and slightly different logic. So, ideally we would refactor that too to use a single set of functions for MAC address recognition and parsing. Given this, I prefer to first get AIX support in (which is working and in rather good shape) and then refactor these internals.

@zestyping
Copy link
Contributor

That makes sense. Thanks for explaining!

@aixtools
Copy link
Contributor Author

@vstinner - I thought I had addressed all your requests. If not, please help with identifying what I missed. Git overwhelms me from time to time.

p.s. I am updating the blurb to be closer to the additional requests of @ncoghlan

Thanks!

@taleinat
Copy link
Contributor

taleinat commented Sep 4, 2019

@ncoghlan, @vstinner, I think this is ready to be merged, but I'd like the final opinion of at least one of you before doing so.

Also, given that the final beta of 3.8 has been released, should this only go into 3.9?

Copy link
Contributor

@ncoghlan ncoghlan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are still a few test skips that I think would be better off checking for the specific assumption(s) they're making, rather than checking for AIX as a platform, but I don't think those are a merge blocking issue.

@@ -674,26 +674,57 @@ class TestUUIDWithExtModule(BaseTestUUID, unittest.TestCase):
class BaseTestInternals:
_uuid = py_uuid

# data is specific to AIX - with '.' as _MAC_DELIM
# and strings shorter than 17 bytes (no leading 0)
@unittest.skipUnless(_AIX, 'requires AIX')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that this is a valid test anywhere that those two conditions are true, so rather than testing for "AIX_" here, you can just test for the MAC notation conventions that the test cares about.

@taleinat
Copy link
Contributor

@ncoghlan

There are still a few test skips that I think would be better off checking for the specific assumption(s) they're making, rather than checking for AIX as a platform, but I don't think those are a merge blocking issue.

Are you looking at the latest version of this PR? I reworked all of those tests and there are no longer any such skip-checks.

@taleinat
Copy link
Contributor

Also, assuming this is now ready to be merged, should it be back-ported to 3.8? (Seems like too big a change for 3.7.)

@ncoghlan
Copy link
Contributor

@taleinat Indeed, I don't think GH was showing me the right thing (it's possible it was attempting to show me "changes since your last review" and the outcome was misleading). I've reviewed the actual latest version now, and my +1 still holds :)

Agreed this is definitely out of scope for 3.7.

The case with a 3.8 backport is murky - the challenge with all the AIX patches is that handling AIX properly is a new feature (since it behaves differently from other *nix platforms in some cases), so "fix bug that only occurs on AIX" isn't an obviously suitable candidate for backporting.

Given that ambiguity, we should probably pass the question to @ambv as 3.8 release manager.

@aixtools
Copy link
Contributor Author

The backport to 3.8 is not a show-stopper. Nice to have, as it was "developed" on 3.7 and 3.8, and time has moved on. Why not a show-stopper? The test suite does not see this as an error, so it does not affect the bot status.

So, the backport is more about how much importance someone gives this.

Thank you all for your time reviewing!

@taleinat taleinat merged commit 0bcbfa4 into python:master Sep 26, 2019
jacobneiltaylor pushed a commit to jacobneiltaylor/cpython that referenced this pull request Dec 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants