Skip to content

Conversation

@ismagilli
Copy link

@ismagilli ismagilli commented Feb 22, 2024

Problem: for files '123.png' and 'img-123.png' current ExportDataAsCode implementation produce uncompilable code, since ExportDataAsCode generates incorrect macro names: 123_DATA[_SIZE] for 123.png (macro name can't starts with digit) and IMG-123_DATA[_SIZE] for ing-123.png (macro name can't contains '-'). The correct С/C++ macro name should contain only '_', letters, digits and not starts with a number. This PR converts all invalid characters to '_', and, if necessary, adds an underscore '_' to the beginning of the macro name.

@orcmid
Copy link
Contributor

orcmid commented Feb 23, 2024

I think the only edge case to handles is ensuring the rewritten identifier does not have an initial "_" followed by a capital letter nor creating an initial "__" (2 underscores). The preprocessor names in such forms are reserved for pre-defined names in C Language implementations.

@ismagilli
Copy link
Author

@orcmid I didn't fully understand you. Do you suggest not to consider the cases I mentioned and try to fix only the prefix?

@orcmid
Copy link
Contributor

orcmid commented Feb 23, 2024

I was warning about intruding on reserved identifiers as the result of inserting "_" and, now that I notice, capitalizing the letters. (The use of _SIZE is already a problem here. _size is more correct.)

And does GetFileNameWithoutExt actually deliver a path?

The problem seems to be that the filename must be one that, absent the extension, is a valid C Language identifier that is not reserved. There are innumerable ways that a filename might not satisfy that condition, and I have no idea how that can be overcome by the proposed rewriting.

Maybe the simple fix is to prefix with one _ if the filename starts with a digit, and replace any non-alphanumeric characters with _ and leave it at that. (This will change "" to "" but no harm done.) I think that's the best that can be done. I would not capitalize. There is no requirement to capitalize preprocessor names being created for this purpose, if it's actually created with a #define.

There can still be some ugly edge cases. An easy way out of all cases is to always prefix the name with something like g_ (for generated) and then simply replace any non-alphanumeric with _. This will avoid any collisions with C Language keywords and built-in preprocessor macros. It is also easier to explain.

@ismagilli
Copy link
Author

ismagilli commented Feb 23, 2024

Understood. Since this code has been around for a long time, many people have managed to use it. It seemed to me that it was not worth breaking backward compatibility, so I did not make too many changes. If you think it is acceptable to break backward compatibility, I will correct based on your message.

I note that the code I suggested adds an underscore to the beginning of the text only if the text starts with a digit. Therefore, double underscores at the beginning are only possible if the file name began with two underscores.

And does GetFileNameWithoutExt actually deliver a path?

No.

@orcmid
Copy link
Contributor

orcmid commented Feb 23, 2024

@ismagilli

If you think it is acceptable to break backward compatibility, I will correct based on your message.
I note that the code I suggested adds an underscore to the beginning of the text only if the text starts with a digit. Therefore, double underscores at the beginning are only possible if the file name began with two underscores.

I think invalid preprocessor names can also be produced if the filename begins with one or two non-alphanumeric characters.

Suggestion: Reduce it to two cases.

  1. If the first character is a numeral, provide a leading "_".
  2. If any character is non-alphanumeric, replace it with "_".

Ensure that case 1 indexes properly, so the transfer of file-name characters starts after the initial "_". Also, design it so that if the filename with or without extension is null, there will at least be a "_" varFileName[] result.

I suggest doing

char *tmpFileName = GetFileNameWithoutExt(filename);

Then set varFileName[0] to a leading '_' depending on tmpFileName[0]. This also will determine if you copy tmpFileName[] starting with initial destination varFileName[0]orvarFileName[1]`.

Do either a safe strcpy or instead do your own copy loop, checking each tmpFileName[i] and copying that or '_' to `varFileName[j] depending on whether the character is alphanumeric or not.

You might as well do your own copy loop, since you are going to look at all of them anyhow.

Finally, defend against over-running varFileName[] and ensure there is at least one '\0' remaining at the end of that buffer.

Is this helpful?

PS: I don't know what limitation there might be on preprocessor identifier length. There is probably a minimum maximum length in accordance with the ISO Standard. It might be helpful to use that as the length (+1) for varFileName[].

PPS: We are also assuming C Language compatible run-time character sets for filenames. I don't know how locale-dependence (and possible UTF8) can be coped with in situations like this one. This has me agree that there might need to be a function in raylib.h for the cases of identifier generation from filenames.

@Peter0x44
Copy link
Contributor

Peter0x44 commented Feb 23, 2024

I checked what vim xxd does. It prepends __ to the filenames beginning with digits, and changes - to _.
https://github.com/vim/vim/blob/315cd1fbcbba7c44ec8743f03645bfcaef58bd55/src/xxd/xxd.c#L30-L32
https://github.com/vim/vim/blob/315cd1fbcbba7c44ec8743f03645bfcaef58bd55/src/xxd/xxd.c#L1009

That's all the special behavior I can find.

@Peter0x44
Copy link
Contributor

Peter0x44 commented Feb 23, 2024

This has me agree that there might need to be a function in raylib.h for the cases of identifier generation from filenames.

I don't agree. It's niche, not an API most users will ever touch.

@orcmid
Copy link
Contributor

orcmid commented Feb 23, 2024

@Peter0x44 > > This has me agree that there might need to be a function in raylib.h for the cases of identifier generation from filenames.

I don't agree. It's niche, not an API most users will ever touch.

I think the need for an external implementation is because of the 3 places this is done for generation of code. It's not so much about raylib.h as a utility function that would be good to have maintained in a single place. It needs a careful shared implementation.

Is there a better place, maybe utils.c and utils.h, for parking this? (I shudder to think it might need to be back-end dependent instead.)

@Peter0x44
Copy link
Contributor

@orcmid I guess rtext.c would be it. I was talking about exposing a public API though.

@ismagilli
Copy link
Author

I think utils.{c,h} better for place this function since rtext is about loading fonts and drawing text.

@Peter0x44
Copy link
Contributor

@ismagilli it also has functions like TextCopy, IsFileExtension, etc, that are used elsewhere throughout raylib

@ismagilli
Copy link
Author

TextCopy and IsFileExtension is a part of the public API, but the function we are talking about is not. If we put a function in rtext, then we have to duplicate the function declaration in each file in which we want to use this function.

@Peter0x44
Copy link
Contributor

True. I guess Ray has the real final say there.
Duplicating this logic isn't terrible but also not ideal.

@ismagilli
Copy link
Author

ismagilli commented Feb 23, 2024

I would like to thank @orcmid for his patience and detailed explanations of his vision. English is my second language and this has become an obstacle to understanding some places in his messages.

I have made changes to the code and will update the PR code now. I did not delete the code that replaces lowercase letters with uppercase letters in order to save backward compatibility. In addition, I made it impossible for two underscores to appear in a row (not only at the beginning of the macro name). I omitted the check for going beyond the boundaries of the array in this function. There are a lot of places in the existing code where such a check is required, but it is missing.

Let's wait for raysan's decision.

@raysan5
Copy link
Owner

raysan5 commented Feb 23, 2024

@ismagilli @orcmid @Peter0x44 After reviewing this issue carefully, I decided not to merge it.

It's not only because it adds a new function with a possible coupling between modules or redundant code but the main concern is that it's impossible to address all possible user filenames issues, for example with special characters, spaces or other languages charsets.

I prefer to keep it as is and force the user to be careful with its naming conventions. If it fails, it't up to the user.

@raysan5 raysan5 closed this Feb 23, 2024
@ismagilli
Copy link
Author

@raysan5 thanks for your reply!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants