dev-gotchas.md
" Set text width as 72.

My software developer blog (mostly)

My LinkedIn profile

This is a log where I list the gotchas I have stepped on as a developer. Maybe someone will find it to be of any use; at least it is useful to me, so as not to step on the same rake twice. Some of the fun in programming is having your assumptions invalidated; this is not just a cause for grief, it is an opportunity to re-examine your assumptions, which is a good thing.

(I should have started a log like this ages ago. Writing stuff down helps with clarifying subject matters.)

I also have another blog for stuff that I used to post on Twitter

---13/06/24 06:33:35----------------------

I have the hobby of solving LeetCode questions. Now I gave the code of my solution to a 'hard' question to the free version of Google Gemini, and asked it to explain the solution. The explanation of the code is quite good! Then I asked it how to improve/optimize the given solution - here it started to talk trash. See this chat session

So as a programmer I don't have to despair, yet... (However, I don't know for how long - and whether the paid versions of ChatGPT/Gemini/Copilot would do better. Also: I never worked with Copilot, so I don't know much about it.)

The Register reports that expectations for generative AI are being scaled back now. However: code generation tools, like Copilot, are among the big success stories in this field of endeavor. link to story

Now I occasionally wonder whether the LeetCode questions weren't generated by some sort of LLM. I asked Gemini to generate some LeetCode questions, here it goes: link. So this imitation game/Turing test situation is getting more abstract; we might be seeing more of these doubts in the future...

Note: when you ask Gemini 'Can you explain the following code?', the answer is an outline of the process/prompt for this specific situation. I guess instead of asking 'what is your prompt?', one should be asking 'how do you approach a given task?'. They seem to have multiple partial prompts now, where each prompt is applicable to a specific context.

---24/09/23 08:18:53----------------------

Since October 7th we have had a war in Israel, my home country. I started to do LeetCode exercises in order to keep my sanity.

Also, it turns out that the skill of solving LeetCode questions is much more appreciated than writing open source code on your GitHub page... /Well, developing my other open source projects was a good way to learn new stuff - my way of staying afloat in an ever-changing landscape of software development, sort of./

... Therefore I am now joining the LeetCode party and putting up some of my solutions here on this site. /LeetCode does test your problem-solving skills, whereas programming, in my humble opinion, is more about writing tools for solving problems (but that's a different story)./

So let's just follow the instructions!

The 'Lego Movie' - 'Instructions to fit in, have everybody like you and always be happy'

I am starting with a list of problems; here are some problem lists on LeetCode. (The good part is that problems are divided into problem sets, and the name of the problem set is a hint on how to solve the problem - there are problem sets like "sliding window" or "dynamic programming".)

6/10/23 - I am past a hundred LeetCode problems (hi, that's me on leetcode).

Later: they have a daily LeetCode challenge; I started to do these. (I recently started to log the LeetCode challenges here.)

My observations, so far:

  • I am over the initial 'leetcode shock'. Some people take LeetCode really hard. However, it can also be fun.
  • It can still take me a lot of time to solve a problem (probably more than it is supposed to take).
  • When stuck with a problem: it really helps to stop and then return to the same problem later, with a fresh perspective. (However, you can't do that in a job interview... :-) )
  • It makes sense to return to a problem while asking follow-up questions (that's what they like to do at job interviews).
  • One follow-up question is: how could you make things faster? If a submission isn't among the top submissions in terms of the desired trait, then there usually is much room for improvement. (Don't overdo it; your objective here is to have a clear and explainable solution - one that will do at a job interview. Also, LeetCode measures total time and not CPU time, so performance can differ between test runs. This performance number is more like a reality check...)
  • If your initial solution passed and is not among the fastest, you can look at faster submissions by clicking on the x-axis of the graph that compares your solution to others; sometimes the faster solution tells you how to solve the problem in a better way. This possibility of comparing your solution to those of others is a big deal - a strong reality check!
  • Some problems are easier to solve recursively; one follow-up question would be: solve the same problem without recursion.
  • Don't take shortcuts. Example: if a function returns more than two values, then make an object out of the return value - otherwise it's easy to lose track. (Same reason: avoid tuples that are standing in for records; it's easy to lose track.)
  • I need to get better at coming up with test cases for my solution. Usually I follow the examples given in an exercise, but frequently there are cases that are not described in the problem statement. (Now I also look at the problem constraints section, as a check, before starting to write a solution.)

Some gripes:

  • Some of the problems are only available to paid accounts.
  • For some of the problems the performance indicator is relevant, but this is not always the case. (They seem to measure total test time; they should measure CPU time instead, as that would give a more consistent indication.)
  • Nitpick: sometimes an easy LeetCode problem is ranked as Medium - sometimes the rank of a problem doesn't make sense. Maybe that is explained by the requirement to have a bell-shaped curve for the difficulty levels of questions, so that most of them need to be ranked as Medium in order to fit the curve.

---31/05/23 05:51:00----------------------

While doing this side project a WEB dashboard for managing docker containers

Found an area where ChatGPT/Google Bard can be used to great benefit: I am not an expert in frontend stuff, but now I had to fiddle with CSS style sheets - the LLM knows all the style combinations, and its recommendations worked!

They might have done some specialized training on CSS, I had less success when asking about other domains.

So far this was my only real success; I got lots of hallucinations when asking about other areas. Other people are reporting enormous increases in productivity; however, I could not reproduce this kind of success - I observed that it is a non-trivial task to distinguish between fact and LLM fiction.

---23/02/23 07:08:35----------------------

My new project is a programming language called PYX - the project is here. I guess that's a project that will keep me busy for a while.

One problem is: how do you test a programming language? One approach is bootstrapping: implementing the language's compiler/runtime on top of a more limited version of the same language. That is a very sound approach for compiled languages with static typing. However, a scripting language/shell like PYX wouldn't quite benefit from this approach.

I am trying to test the PYX programming language by implementing LeetCode problems. (I never bothered to do LeetCode exercises seriously; maybe I will get some sense from this kind of exercise ;-) )

I have also started to keep most of my notes on various subjects in the following file: here it is. Most of my work involves some sort of navigation through diverse software environments. These environments are all different - but they do have similarities. The most important part seems to be: not getting lost.

---18/02/23 19:46:03----------------------

I had a chat with ChatGPT about the book "Future Shock" by Alvin Toffler. Science fiction gone real, wow... (more of me talking with the robot here)

Me: 'please summarize the book "Future shock" by "Alvin Toffler"'

ChatGPT: '"Future Shock" is a book by Alvin Toffler, published in 1970, that explores the social and psychological consequences of rapid technological and cultural change in the 21st century. Toffler argues that society is experiencing "future shock," a psychological state of disorientation and anxiety caused by the rapid pace of change.

Toffler argues that the pace of change is accelerating faster than people can adapt, leading to a range of societal problems. He explores topics such as the impact of technology on work and the economy, the changing nature of family and social relationships, and the rise of a "super-industrial" society in which machines and technology dominate.

Toffler also suggests ways that individuals and society can adapt to these changes, such as by embracing lifelong learning and developing new social and political structures. He argues that individuals must become "futurists," able to anticipate and adapt to the rapid changes that are reshaping the world.

Overall, "Future Shock" provides a thought-provoking analysis of the impact of technology and cultural change on society and suggests strategies for coping with the rapid pace of change.'

Me: 'Is it possible to adapt to all these changes?'

ChatGPT: '"Adapting to all the changes brought about by rapid technological and cultural changes, as discussed in "Future Shock," can be challenging, but it is also possible. One of the main points of the book is that individuals and societies need to become more future-oriented and adaptable to keep up with the pace of change.

Toffler suggests that people can adapt by being open to new ideas, learning new skills, and being flexible in their thinking. He argues that individuals should embrace lifelong learning to keep up with the changes in their field of work, as well as to learn new skills for future job opportunities.

Societies can also adapt by developing new social and political structures that are better suited to the changing world. For example, Toffler discusses the need for a new form of democracy that is more participatory and responsive to the needs of citizens.

Overall, while adapting to rapid technological and cultural change can be challenging, it is possible through individual and societal efforts to embrace change, learn new skills, and develop new structures to keep up with the pace of change."'

Me: 'but will it work out?'

ChatGPT: "It is difficult to predict with certainty whether adaptation to rapid technological and cultural change will work out successfully in the long term. The success of such adaptation depends on a variety of factors, including the speed and scale of the changes, the willingness of individuals and societies to adapt, and the effectiveness of the strategies used to cope with change.'

Me: 'How should democracy be changed?'

ChatGPT: 'In "Future Shock," Alvin Toffler argues that democracy needs to be changed to better suit the needs of a rapidly changing society. He suggests that the traditional form of representative democracy, where citizens elect representatives to make decisions on their behalf, is not well-suited to the complex and rapidly changing issues faced by society.

Toffler proposes the idea of a "participatory democracy" where citizens have a more direct say in decision-making. This could be achieved through various means, such as electronic voting, public meetings and forums, citizen initiatives and referendums, and other forms of direct citizen participation.

Toffler also argues for the decentralization of decision-making power, to ensure that decision-making is closer to the people it affects. This could be achieved by devolving power to local communities and organizations, rather than centralizing power in national governments or large corporations.

Overall, Toffler suggests that democracy needs to be reimagined to be more inclusive, participatory, and responsive to the needs of citizens. By giving people more direct say in decision-making and devolving power to local communities, democracy could better adapt to the rapid changes of a complex and interconnected world.'

Me: 'Who will be on top of all this?'

ChatGPT: 'In the context of rapid technological and cultural change, it is difficult to predict who will be "on top" or the most successful in the long term. As new technologies and cultural shifts emerge, new opportunities for success and influence also arise. However, there are some general trends that can be observed.

Individuals and organizations that are able to adapt quickly to change and embrace new technologies and cultural shifts may be more likely to be successful. Additionally, those who are able to anticipate future trends and plan accordingly may have an advantage over those who are slower to adapt.

In terms of organizations, those that are innovative, flexible, and agile are more likely to succeed in a rapidly changing environment. This includes companies that are willing to invest in new technologies and ideas, as well as those that are able to pivot quickly to respond to changing market conditions.

Ultimately, success in a rapidly changing environment depends on a combination of factors, including adaptability, innovation, and the ability to anticipate and respond to change. The individuals and organizations that are most successful will be those that are able to navigate these factors successfully, regardless of their current position or background.'

Me: 'Are you ok with that?'

ChatGPT: 'As an AI language model, I do not have personal feelings or preferences, so I am neither "ok" nor "not ok" with any particular outcome. My purpose is to provide information and answer questions to the best of my abilities, based on the available data and knowledge. It is up to individuals and societies to determine their own views and opinions on the topics we discuss.'

Me: 'what does it mean to embrace change?'

ChatGPT: 'To embrace change means to accept it, adapt to it, and even welcome it as a positive force for growth and improvement. It means recognizing that change is a natural and inevitable part of life, and being willing to adjust one's thoughts, behaviors, and attitudes to keep up with changing circumstances.

Embracing change involves being open-minded, curious, and willing to learn. It also requires a certain level of flexibility and resilience, as change can sometimes be disruptive and uncomfortable.

Embracing change can be beneficial in many ways. It can help individuals and organizations stay relevant and competitive, avoid stagnation and complacency, and take advantage of new opportunities for growth and improvement. It can also lead to personal and professional development, as individuals are forced to learn new skills and adapt to new situations.

Overall, embracing change is a mindset that involves being willing to take risks, learn from mistakes, and continuously adapt to new circumstances. It can be challenging, but it is also necessary for growth and success in a rapidly changing world.'

Me: 'what would be the alternative?'

ChatGPT: 'The alternative to embracing change would be resisting or rejecting it. This could involve maintaining the status quo, avoiding new ideas or innovations, and being unwilling to adapt to new circumstances or challenges.

While resistance to change may offer a sense of security and stability in the short term, it can have negative long-term consequences. It can lead to missed opportunities for growth and improvement, and can leave individuals and organizations behind in a rapidly changing world.

Furthermore, resisting change can often result in increased stress and anxiety, as individuals and organizations struggle to maintain outdated practices or systems that are no longer effective or relevant.

Overall, while it can be tempting to resist or reject change, especially when it seems difficult or uncomfortable, the alternative of embracing change and adapting to new circumstances is often the more beneficial and productive approach.'

---15/11/22 14:08:45----------------------

I was wasting a lot of time with building Dockerfiles - the files for building a docker image. Here is a better approach:

  • Start a container from the base image; let's assume that the base image is fedora:latest - DID=$(docker run -d -p 9000:8000 -v $PWD:/mnt/loc --name dcont-dbg fedora:latest /bin/sleep infinity) - the main process of the container does not do much, it just keeps the container in a 'running' state. (An alternative would be DID=$(docker run -d -p 9000:8000 -v $PWD:/mnt/loc --name dcont-dbg fedora:latest /bin/sh -c 'while true; do sleep 100; done'))
    • -p 9000:8000 host port 9000 will be mapped to container port 8000 - that's handy if you need to install and test a server that listens on port 8000 (within the container)
    • -v $PWD:/mnt/loc the current directory will be seen as /mnt/loc within the container
    • --name dcont-dbg gives the container a name (for the next step)
  • Check that the container is running.
  • Attach a shell to the container with docker exec -ti dcont-dbg /bin/sh, run the commands required to set up the desired environment, and then take the command history as the basis for writing the Dockerfile. You can copy the needed files via the mounted directory, and you can check whether the installed service works, as the required port is reachable from the host.

That's much better than starting to write the Dockerfile by means of trial and error!!! You can try out stuff in the shell interactively, then look at the command history and use that as the basis for writing the RUN commands in the Dockerfile. Another bonus: the environment keeps running within the container, so you can examine it as a reference point.
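The Dockerfile can then be assembled from the recorded shell history. A minimal sketch of what the result might look like - the package names here are placeholders, not what I actually installed:

```dockerfile
# Base image used during the interactive debug session
FROM fedora:latest

# RUN commands recovered from the shell history of the debug container
# (placeholder packages - replace with whatever you actually installed)
RUN dnf install -y python3 procps-ng && dnf clean all

# The server tested via the port mapping listens on 8000 inside the container
EXPOSE 8000
```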

BTW, the docker command line seems mysterious in many ways; however, the documentation has a number of great guides that explain it! See the Docker Guides - it's all hidden under: Running your app in production / Configure containers!

---23/08/22 19:45:52----------------------

I figured out how to deal with Hacker News and Twitter - just never log in! Come on - 90% of what you read on both sites is clickbait or just a waste of time. Also, I doubt that people really value a different opinion - they only seek confirmation of their own opinions and biases. Almost like in the real world ;-)

(Obligatory xkcd link: https://m.xkcd.com/386/ )

.. Now i got a problem: where do i get any news from?

(updated in 2024)

The final nail in the coffin of my online participation came with the Israel–Hamas war of 2023... I was especially struck by the fact that the HN crowd managed to overlook/ignore the Hamas atrocities. Maybe they are all just echoing the biases of philg, their master, who knows...

---18/08/22 09:20:39----------------------

A monorepo is when you have many projects in the same source code repository.

This approach simplifies the handling of dependencies, as almost everything lives under the roof of the same source code repository: it is relatively easy to find the code for the right version of your library dependency - it's right there in the source code repo! However, not everything is rosy.

I think that one problem of this approach comes with continuous integration testing: the unit/integration tests run with the continuous integration build on the repository, and they add up as you keep adding projects to the monorepo. Unless your build system can map a given change to the minimal subset of tests it affects, you need to run all of the tests for every commit! That can make for a very long test cycle.

The alternative would be a hierarchy of sub-projects, where any change has to be promoted along this path until it is accepted into the global project. Frankly, that doesn't look like a lot of fun...

I wonder if there is a good solution to this problem, while having it all in the same source repository. I suspect that any bigger project will inevitably run into problems of scale, like the one explained above.
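A crude approximation of a minimal test subset is to map the changed file paths to top-level project directories and run only those test suites. A sketch, under the assumption that each top-level directory of the monorepo is one project (the directory names below are hypothetical); in a real repo the file list would come from something like git diff --name-only:

```shell
#!/bin/sh
# Print the top-level project directories touched by a set of changed files.
changed_projects() {
    for f in "$@"; do
        echo "${f%%/*}"     # keep only the first path component
    done | sort -u          # each affected project once
}

changed_projects "libA/src/x.c" "libB/y.c" "libA/z.h"
```

The CI job would then iterate over the output and run each project's test suite - which only works if cross-project dependencies are also tracked, which is exactly where the hard part (and tools like build graphs) comes in.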

--

"Well, then what can a poor boy do / Except to sing for a rock and roll band?"

In the words of Mick Jagger...

---21/05/22 03:40:55----------------------

I want to build some python + c based stuff (based on glibc) in a docker:

Python wheel builds use the manylinux docker image, which was based on CentOS (as of 2022 they depend on CentOS 7 - that one reaches end of life in 2024). But Red Hat killed CentOS, and that step makes it hard to trust anything that comes out of Red Hat (sorry, Fedora).

Alpine is a very small Linux distribution; people like to build small docker images based on that distribution. However, there is a catch: it is based on the musl standard C library, and some projects have a problem when compiled with musl, as this standard library is slightly different from glibc. There is a problem if the project that you want to build has an incompatibility with musl (that happens).

There is a thing called flatpak on Alpine, but it doesn't help when you need to compile a project that uses a function missing from the musl stdlib (flatpak helps with running executables).

/and then all this goodness will break after a few months anyway, stuff is so brittle, nowadays/

Interesting fact: maintaining a Linux distro is a lot of work; Fedora was doing a good job, in general. Alpine is OK for most cases; it turns out that Alpine is sponsored by Docker... (turns out they all need a big sponsor, in the end...)

See here

""" When the Docker community was looking for a distro they found that they could make their containers significantly smaller if they used Alpine and that building the Alpine containers takes a fraction of the time compared to other distros.

“This made the Alpine community grow very fast the last 2 years. Even Docker Inc. got the eyes up for Alpine so now Docker sponsors the development of Alpine,” """

/with linux you may have more choice, but that also seems to go with more politics... go figure.../

---20/05/22 02:42:38----------------------

Some say that Gerrit is a better code review tool than what is offered on GitHub; I would have to play around with it... Gerrit's main focus is on a single commit - now what happens if there are multiple commits? Does the thing descend into chaos? Would need to play around with it...

---20/05/22 02:42:37----------------------

Interesting thought: GitHub/Microsoft is pushing the Copilot product, which is supposed to tell you how and what to code; however, the code review tool on GitHub sucks deeply, and they don't seem to be working on that one. Fixing the code review tool should be a much simpler task, imho...

The deficiencies that I found with the GitHub code review tool:

  • The "conversation" screen of a pull request is enormously cluttered, hard to understand what is going on.
  • I am working through the list of issues, and marking the checkbox on issues as I go along, in order not to loose track of where I am. It turns out that the reviewer is now receiving a notification, when I click on that checkbox! Now he sees the code prior to the changes being committed, which does not create a good impression...
  • Issues being collapse for no apparent reason, also not being able to reply in some cases, when the system thinks that the code has changed, or things have moved.
  • there should be some protocol that goes like this
    • the submitter marks an issue as done
    • the reviewer approves of the change
    • the issue should then be moved to some other screen, so as not to clutter the "conversation" screen. (You could do that by the editor deleting the review comment, but they don't do that, as it gives the appearance of no work being done).

I wonder if someone has made an alternative UI for code reviews, one that would be better suited for the task...

---19/05/22 09:18:58----------------------

see my notes on studying code review processes link

a funny picture on the subject:

code review

---08/05/22 21:57:10----------------------

Another bash trick:

err_report() {
    set +x
    echo "Error on: ${BASH_SOURCE}:$1 : $(sed "${1}q;d" "${BASH_SOURCE}")"
    exit 1
}
trap 'err_report $LINENO' ERR

That's slightly better than set -e - if a command fails, it also shows the file, the line number, and the line in the script that failed.
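The sed "${N}q;d" idiom used above extracts a single line from a file; a minimal demonstration:

```shell
#!/bin/sh
# Pick out the second line of a file with sed.
# "2q;d" means: quit after printing line 2, delete every other line.
printf 'alpha\nbeta\ngamma\n' > /tmp/sed_demo_$$
sed '2q;d' /tmp/sed_demo_$$    # prints "beta"
rm -f /tmp/sed_demo_$$
```

(The `2q` also stops sed from reading the rest of the file, which matters for big files.)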

Also: instead of set -x in a top level script, do the following:

      export PS4='+(${BASH_SOURCE}:${LINENO}) '
      export TRACE=1
      set -x

That will show the file:line) before each executed line.

Now the TRACE environment variable can be propagated to docker containers with the following (note that PS4 is not empty, even if not in use! That's why we have TRACE):

docker build --build-arg TRACE="$TRACE" ....
docker run -e TRACE="$TRACE" ...

(don't ask me, why docker build and docker run have different options for passing on environment variables)

Now the child script can also enable its tracing, conditioned on the presence of the TRACE env variable that was passed down from the top level:

if [[ -n $TRACE ]]; then
    export PS4='+(${BASH_SOURCE}:${LINENO}) '
    export TRACE=1
    set -x
fi
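For docker build --build-arg to have any effect, the Dockerfile has to declare the argument. A minimal sketch - the ./setup.sh script here is a hypothetical stand-in for whatever build step honors the TRACE convention above:

```dockerfile
# Empty unless --build-arg TRACE=... is passed to docker build
ARG TRACE
# Also turn it into an environment variable of the resulting image,
# so scripts run at container start can check it too
ENV TRACE=${TRACE}
# Build steps (and the scripts they invoke) can now enable 'set -x' when TRACE is set
RUN ./setup.sh
```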

---30/04/22 11:43:11----------------------

I got a new M1 Mac at work; suddenly it turns out that running a docker container is turning into rocket science. The aim is to run a basic Alpine docker image, with the home directory mounted into the file system of the container:

The --platform linux/amd64 option tells docker to run the docker image in an x86_64 emulator. So that's more of a VM than a lightweight container...

(Interesting side note: you can run an x86 executable on an M1 Mac because the macOS operating system has the Rosetta component, which translates the instructions of an x86 executable into M1 instructions. The docker thing has to do a corresponding emulation/translation, whatever you call it.)


# on the host
> uname -m
arm64

> docker run --rm --platform linux/amd64 -it --entrypoint /bin/sh -v $HOME:/var/home alpine:3.12

/ # uname -m
x86_64

Luckily you can inspect the architecture of an image

> docker image inspect alpine:3.12 | grep Architecture
"Architecture": "amd64",

However it is possible to run a native docker image on this M1 mac.

> docker run --rm  -it --entrypoint /bin/sh -v $HOME:/var/home arm64v8/alpine

/ # uname -m
aarch64

You can also get the architecture with jq (hoping the JSON format doesn't change...). For me it's:

> docker image inspect arm64v8/alpine:3.12 | jq -r 'first | .Architecture'
arm64

> docker image inspect arm64v8/alpine:3.12 | jq -r 'first | .Os'
linux

And it's possible to be more specific about the version tag of the image

docker run --rm  -it --entrypoint /bin/sh -v $HOME:/var/home arm64v8/alpine:3.12

Again, you can inspect the architecture of an image:

> docker image inspect arm64v8/alpine:3.12 | grep Architecture
        "Architecture": "arm64",

Now of course you will get some script here that tries to make life easier by abstracting all this madness (but that script probably has a dozen or so bugs).

Here is the link to the script runimagelocal

Here is the help text

Usage: /Users/michaelmoser/bin/runimagelocal -i <image name>  [ -s <shell> ] [ -m <mount_dir>]

Runs a container interactively with the docker image, by default using the shell /bin/sh.
(Can override the shell with the -s option)

The HOME directory is mounted to /var/home in the container (can override with -m option)

If no image is given then a default native linux image is run in a container, where the home directory is mounted into the container. Otherwise all the details mentioned earlier are performed:

  • the architecture of the image is determined
  • if it isn't a native image, then the --platform option is set.

Some say that adding another script like this is just adding another detail that needs to be remembered; however, for me it's a way to figure things out. Also it saves some time, I really think that!
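The core of such a script can be sketched as a small helper that decides whether the --platform flag is needed. The function name is mine, and I assume the image architecture is obtained via docker image inspect plus jq as shown above, and the host architecture via uname -m:

```shell
#!/bin/sh
# Decide which --platform flag (if any) docker run needs, given the
# image architecture and the host architecture.
# image arch: docker image inspect IMG | jq -r 'first | .Architecture'
# host arch:  uname -m  (reports arm64 on an M1 mac)
platform_flag() {
    image_arch="$1"
    host_arch="$2"
    if [ "$image_arch" = "$host_arch" ]; then
        echo ""                               # native image: no flag needed
    else
        echo "--platform linux/$image_arch"   # emulated: be explicit
    fi
}

platform_flag amd64 arm64    # x86 image on an M1: prints "--platform linux/amd64"
platform_flag arm64 arm64    # native image: prints nothing
```

The real script would then splice the result into the docker run command line, together with the -v $HOME:/var/home mount and the chosen shell.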

---28/03/22 09:29:31----------------------

Remembering the details of golang - I wasn't exposed to this language for quite some time now... here is my exercise on the subject.

other notes on golang here

Golang has some very confusing semantics, paired with very strong opinions on how a program should be written. Interesting that golang comes from some of the same people as C and Unix (Ken Thompson worked on both), and C is much less opinionated.

Later: asked ChatGPT about this problem: link

/I find myself talking a lot to ChatGPT. Maybe that's a sign of ... old age :)/


How is it that the same authors of C wrote golang, which is a very opinionated programming language.

ChatGPT: The authors of the C programming language, Ken Thompson and Rob Pike, were indeed involved in the creation of Go (Golang), but it's important to understand that Go's development involved a larger team of contributors and designers as well. While Ken Thompson and Rob Pike brought their experience and insights from working on C, Go was developed with different goals and priorities in mind.

Go was designed to address certain shortcomings and challenges that developers were facing, particularly in the context of concurrent programming and large-scale systems development. To achieve these goals, the designers of Go made deliberate decisions to introduce certain opinions and conventions into the language. These opinions and conventions include things like:

  • A preference for simplicity and readability in code. Built-in support for concurrency primitives like goroutines and channels.
  • A standard formatting tool (gofmt) to enforce a consistent code style.
  • A package management system (modules) to manage dependencies.
  • A strong emphasis on performance and efficiency.

These opinions and conventions help shape the way Go code is written and organized, making it more opinionated compared to C. However, it's worth noting that Go still allows for flexibility in many areas and doesn't enforce opinions as strongly as some other languages like Ruby or Python.

In summary, while the creators of C were involved in the development of Go, the two languages have different design goals and priorities, leading to differences in their level of opinionation.


Makes sense: golang makes lots of decisions behind the scenes - like deciding whether a value is placed on the stack vs the heap, go thread scheduling, garbage collection. Unlike C, it is no longer a relatively direct mapping between the code and the hardware that runs it. This means that many aspects of the program's style can affect the result; therefore go tries to prescribe all of these details in a kind of rulebook...

---20/03/22 00:06:13----------------------

Today I learned about Java: (I tend to learn new language features when being exposed to some new code base; I never bothered to keep up with all the changes of the Java language...)

https://stackoverflow.com/questions/37583464/what-is-round-brackets-parentheses-in-try-catch-in-java https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html

The Python with statement has an equivalent in Java - see the try-with-resources syntax; the BufferedReader must implement the AutoCloseable or Closeable interface (must have a close method).

try (BufferedReader br =
               new BufferedReader(new FileReader(path))) {
    return br.readLine();
}

I see that often: some language feature appears in, or inspires, multiple programming languages... (I wonder if there is a term for that.)

A Java lambda with an empty parameter list () can be used as a Runnable instance; strange but true... https://stackoverflow.com/questions/25192108/what-is-the-breakdown-for-javas-lambda-syntax For example, it can be submitted to a thread pool...

---23/01/22 03:13:03----------------------

Today I was told a very nice joke, on my front-page here

 ________________________________________
/ Q: How many Harvard MBA's does it take \
| to screw in a light bulb? A: Just one. |
| He grasps it firmly and the universe   |
\ revolves around him.                   /
 ----------------------------------------
        \   ^__^
         \  (..)\_______
            (__)\       )\/\
             -- ||----w |
                ||     ||
nltk score: -10 '(from -100..100) state: 'neutral' eye: .. tongue: --

The script that updates the front-page is running this little project; the text is taken from the fortune program, and piped into cowsay, to display a nice ascii-art cow. Now I am also running it through nltk based sentiment analysis, a good score makes the cow smile, a bad one makes her frown. However in this instance the system failed to appreciate the subtle humor of the joke, that's the kind of robots we have right now.

Now we do have smarter robots. Both IBM Watson and Google search are examples of systems that take a more layered approach: these systems dissect a complex problem into many tiny aspects, and they manage to integrate these tiny aspects into a coherent picture. However it takes a lot of people and enormous amounts of resources to do the trick. The next logical step would be to simplify the construction of such systems, but that is clearly beyond the scale of this little blog (it also seems to be beyond the scale of any company, taken in isolation)

The micro service architecture can be understood as an attempt to standardize the process of dissecting a big problem into relatively simple component parts. The real objective of the micro service architecture is to have a system that is horizontally scalable, so as to make the system capable of servicing a very large workload. Maybe a similar approach is required in order to simplify and standardize the layers of data processing and knowledge processing pipelines. That would probably be a process that would require a sustained effort over many years, and it would most likely require many attempts and failures in order to find something workable.

In a sense I am also grasping that light bulb, here in this blog entry, and waiting for the universe to revolve around it ;-)

... on a related note: it takes even fewer Zen masters to screw in a light bulb, but the Zen master has the same approach as the Harvard MBA...

 _________________________________________
/ Q: How many Zen masters does it take to \
| screw in a light bulb? A: None. The     |
| Universe spins the bulb, and the Zen    |
| master stays out                        |
|                                         |
\ of the way.                             /
 -----------------------------------------
         \   ^__^
          \  (..)\_______
             (__)\       )\/\
              -- ||----w |
                 ||     ||
nltk score: 0 '(from -100..100) state: 'neutral' eye: .. tongue: --

I found some resources, to get some hint of what I am talking about: awesome datascience and datascience for beginners, it seems that searching for 'awesome anything' on github will give you a great intro to any topic...

---11/12/21 13:01:21----------------------

I am a long time reader of hacker news. It's an incredible source of information, but it too has its editorial biases. For example this highly interesting discussion appeared on the front page the other day, but then suddenly disappeared from the front page again, and also from any other page in the vicinity of the front page on that day. I am not quite sure as to why that happened.

Anyway, it's probably time to diversify my information diet, and the source is right here! When logged into github, the main page shows all the repositories that were recently starred by any github user who got a star from me. That is an excellent source on what is happening here! And so it goes that I am starring everyone who was so kind to put a star on any of my projects, partially out of curiosity and a desire to learn new stuff.

You might also want to look at one of the scripts that I wrote recently. Here, this script builds the README.md file that appears on my github profile, by virtue of the repository being named after my github user. All this is run as a github action - as a continuous integration process, hosted by our (hopefully) benevolent hosts here. (More details in the about page)

One of the generated reports is a report of every repository that got a star from me.

Thank you all for keeping me up to date, on what is going on!

Clicking on the image will open a youtube video with the song 'Thank you all' by the 'Free Design'


"Think and dream and share your mental wealth/The world is out to get us, but it can't/Because we're friends!"


Update: now my python tracer and bytecode course got enormous visibility, after having been on the front page of hacker news! here. Thanks a lot! Now it is impossible to follow everyone manually, so i fixed this script and added the -f option to do that automatically. Sometimes i get so many followers that the script fails in one pass, as it exceeds a github api throttle on the number of rest api calls, but that's a good problem, in my book ;-)

---22/11/21 06:08:24----------------------

I am dabbling quite a bit in Python, in my public repositories here on github. Python sometimes feels like a kind of lisp; Peter Norvig is often quoted as having said so. There are some similarities, due to python being very dynamic and expressive. (Also, if you ask me: you need to overcome some barrier of entry with the syntax - for python it's that spaces have real meaning, for lisp it's the nesting of lists. You are fine with both lisp and python, once you have overcome these difficulties.)

One area where python is not lisp: python is not Homoiconic. This means that a python program can't be manipulated as python data. As a result, the python interpreter cpython needs to do quite a bit of parsing, so it is time to look a bit at parsing, in general and in the world of python.

Parsing comes in two stages: often there is a first stage, called lexical analysis/tokenization, where the input text is broken up into tokens like keywords, identifiers and comments, and a second stage called syntax analysis, which takes the tokens as input and builds an abstract syntax tree that shows the structure of the program. (in a homoiconic programming language, you would get the abstract syntax tree for free, in the form of the data structure that expresses the program)
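Both stages can be observed with python's own standard library; a small sketch (the sample source line is arbitrary):

```python
import ast
import io
import tokenize

src = "x = 1 + 2\n"

# stage 1: lexical analysis - the source text is broken up into tokens
for tok in tokenize.generate_tokens(io.StringIO(src).readline):
    print(tokenize.tok_name[tok.type], repr(tok.string))

# stage 2: syntax analysis - the source is turned into an abstract syntax tree
tree = ast.parse(src)
print(ast.dump(tree))
```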

Now some parsers try to do without the lexical analysis stage: parsing expression grammars (PEG) describe a type of top-down parser that is famous for doing without a lexical analysis stage, but the result turns out to be a bit awkward:

I think that peg parsers have a lot of problems:

  • One problem: you need to deal with comments as part of the grammar. That can be tricky, as comments may appear just about everywhere. One way to fix that problem with PEG parsers is to have a filter stage that removes comment tokens from the input before they reach the parser; that would be a kind of preprocessor similar to lexical analysis: lexers eat up comments and just don't pass them on as tokens.
  • The peg parser needs to do a lot of backtracking, partly to resolve ambiguities between tokens. An example ambiguity: an identifier can be matched by the regular expression [A-Za-z][a-zA-Z0-9_]* , but this regular expression also matches just about every keyword.
  • It needs a lot of lookahead, a whole lot.

Still the grammar for peg parsers is much more intuitive, and you don't have to fight shift/reduce and reduce/reduce conflicts (like with yacc/bison), or indulge in other workarounds.
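To make the backtracking point concrete, here is a toy sketch of PEG-style ordered choice, working directly on characters without a lexer; the grammar expr <- num '+' expr / num is made up for illustration:

```python
import re

def parse_num(s, i):
    # match a run of digits at position i; return (new position, value) or None
    m = re.match(r"[0-9]+", s[i:])
    return (i + m.end(), int(m.group())) if m else None

def parse_expr(s, i):
    # first alternative of the ordered choice: num '+' expr
    r = parse_num(s, i)
    if r:
        j, left = r
        if j < len(s) and s[j] == "+":
            r2 = parse_expr(s, j + 1)
            if r2:
                k, right = r2
                return (k, left + right)
    # the first alternative failed past this point: backtrack to i
    # and try the second alternative: just num
    return parse_num(s, i)

print(parse_expr("1+2+3", 0))  # → (5, 6)
```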

Anyway, lets look at what cpython is doing: before python 3.9, cpython had a simple grammar; the ideal was to have an LL(1) grammar, where you just look at one token in order to decide which clause will be parsed. In practice they had a couple of workarounds, so it really was an LL(k) grammar, one that can be parsed with a fixed number of lookahead symbols. Up to python 3.9 the grammar was defined here and it had a lexical analysis stage with tokens defined here. Most of the grammar would be translated into a finite automaton that was used for the syntax parsing. With python 3.9, the default parser is a PEG parser generated from this grammar definition. Note that all the keywords are defined as string constants that appear in the grammar definition file.

Lets build cpython on the mac:

On the mac you first need to get openssl, as a prerequisite. brew install openssl

Then get the sources and build it

    git clone https://github.com/python/cpython.git cpython
    cd cpython

    # for 3.8 (still has the old dfa based parser)
    git checkout origin/3.8 -b 3.8
    ./configure --with-pydebug --with-openssl=$(brew --prefix openssl) CC=clang

    # for 3.9 (uses a new peg based parser)
    git checkout origin/3.9 -b 3.9
    ./configure --with-pydebug --with-openssl=$(brew --prefix openssl) CC=clang

    make -s -j2

Anyway, i think that parser speed is not that important to python - in the end, most of the code ends up being translated into bytecode files with the extension *.pyc, and the same bytecode files are used for subsequent runs of the same program. What really matters is the speed of the runtime interpreter that runs the bytecode; that's where most of the time is being spent. The Python developers argue that the PEG parser is within 10% of the speed of the previous table based parser, one reason being that the in-memory syntax tree produced by the peg parser doesn't need to be post processed. (I would guess that the parser performance assessment actually depends on many factors, like most things in software).

What is interesting, is that the switch to the PEG based parser in python 3.9 coincides with a whole set of changes in the python syntax here. Apparently it has become easier to add stuff to the language. I suspect, that this factor is the real motive behind the big switch in parsers.

For more info see here and here.

Also, the switch in parsers almost coincides with a change in governance of the python project here, maybe there is some correlation here, go figure...


I am not sure if the analogy between python and lisp is the correct one. Python is all about its object system; just look at python meta objects, which are said to have come to us from smalltalk. So it would be fair to say that Python has a very strong ancestry in Smalltalk. Interesting that the wikipedia article on Python does not list Smalltalk as having had an influence on python here. On the other hand: smalltalk was very much influenced by Lisp, so they say. So maybe that was the meaning of the comparison...

Now here is my attempt to understand the python object model, the Python object primer for python3 . Let's see how that works out...

What i learned from this: Python's metaprogramming facilities are built around metaclasses and decorators (these are explained in the linked course). These tools are built on reflection and the dynamic nature of python. It is possible that this is enough to offset the lack of homoiconicity; i am not sure about this...
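A minimal sketch of what such metaprogramming looks like - a made-up metaclass (not from the linked course) that injects a __repr__ into every class created with it:

```python
class AutoRepr(type):
    # the metaclass gets to rewrite the class at creation time
    def __new__(mcls, name, bases, ns):
        cls = super().__new__(mcls, name, bases, ns)

        # inject a __repr__ built from the instance's attributes
        def __repr__(self):
            fields = ", ".join(f"{k}={v!r}" for k, v in vars(self).items())
            return f"{name}({fields})"

        cls.__repr__ = __repr__
        return cls

class Point(metaclass=AutoRepr):
    def __init__(self, x, y):
        self.x, self.y = x, y

print(Point(1, 2))  # → Point(x=1, y=2)
```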

---08/11/21 03:33:18----------------------

I was plagued by garbage characters appearing in vim. Actually i never get this on linux, only on the mac. To fix this problem somehow, i redefined ctrl+a to redraw the screen

:vnoremap <C-A> <Esc>:redraw!<Return>

:inoremap <C-A> <Esc>:redraw!<Return>i

:nnoremap <C-A> :redraw!<Return>

Luckily there is a fix for that, explained here. Now if you get this problem, then put this into your ~/.vimrc, it helps with the problem!

if has('mac')
:set t_RV=
endif

What would i do without stackoverflow...

Nope, i still need the shortcut on occasion, especially when searching for a string, the screen doesn't always get redrawn....

---06/11/21 09:37:45----------------------

Today i learned, that google is tracking your entire browser history in Chrome, is tracking your every move via android, is also recording your voice (not quite sure what is being recorded), among other things... It says that you can disable this here; though i am not entirely sure if you can trust that statement.

See more here

That's what I call 'opinionated snooping': the default is to track your every move. In a way I am glad that i failed to pass the google hiring process, when I tried. I really wouldn't like to be part of these games, even though they are probably a cool place to work at. This mega corporation seems to have a vision of things which is not quite pleasant: i mean the statement 'don't be evil' depends on your particular definition of evil. (I guess they would regard a bad financial quarter as very evil)

Among other things I learned: it turns out that Orwell did get a lot of ideas for '1984' from Zamyatin's 'We'; see Orwell's review Also here. I didn't quite find '1984' to be very convincing, when I read it; maybe Orwell did not experience too much of socialism, when he was in Spain, during the civil war there; however 'We' was much more real to me, when I was growing up in the German Democratic Republic, aka East Germany.

Sorry for all of the politics, this was meant to be a professional blog. Well, I think there is no way to get around that...

--

Here is a good one: 'there ain't no anti-utopia, that our current elites wouldn't want to implement for real' source

Unrelated: I made a crawler generated web page, that lists flagged submissions and comments from hacker news Here is the result: 'red flagged hacker news' and the crawler for this stuff is right here (I tried to make the logo look like that of the GDR newspaper the 'Neues Deutschland', giggle)

Update: google at least gives you an 'opt out' clause (not sure if they really do an opt out, or if they have to keep the data because of some other considerations), other giants of the surveillance industrial complex don't do that (facebook doesn't have an opt-out clause, for that matter)

--

A friend of mine managed to appeal a parking ticket, based on the location data that is tracked by android snooping. Maybe he is right and i am wrong; maybe we get a good deal, by giving up our privacy in return for better content suggestions on youtube, or, as in this case, in return for the ability to prove our whereabouts when needed. However I don't want to feed the monster with more data than necessary; also everything that 'listens and looks', aka 'Horch und Guck', is just a bit too frightening, at least to me.

--

At least vim isn't calling home (unlike visual studio code, with its telemetry).

Anyhow: I have a vim plugin that uses openssl to encrypt a file with a symmetric key (aes-256 or other ciphers) link, that works for me.

This can be used to keep your own private notes secured while saved on disk/save at rest (well, they can still put up a keylogger, or read the editors memory, if they really bother)

Anyhow, this vim plugin is a kind of technological fix, for a problem that is posed by society. (Well, most of the entries in this log go with 'here is a script to fix your woes' ;-)

---25/10/21 14:43:56----------------------

While monitoring a log file that is changing;

tail -f mysterious-log-file.log | grep --line-buffered SEARCH_TERM

tail -f shows any changes to the log file; now without the --line-buffered argument on grep, you will not see new matches promptly, because grep buffers its output when writing to a pipe. (stepped on this several times, let's see if I remember it the next time)

---10/10/21 04:46:52----------------------

I made a presentation for the gitblame vim plugin link to project

Now making a presentation based on screen recording is not quite a trivial thing to do on a mac, there are many options here:

  • quicktime can record the screen, but your video will be without sound (at least I didn't manage to record with sound)
  • There are several screen recording programs that claim to be free, but it's only a demo version that requires you to pay for a license.
  • Both zoom and slack could possibly do it, but I didn't manage to make it work. Once upon a time there was google hangouts, where you could do a screen recording. However this product was discontinued by google, for whatever reasons.
  • Didn't manage to use ffmpeg and gstreamer, though I had some fun with gstreamer, while trying: this command gives you a distorted mirror, quite funny: gst-launch-1.0 avfvideosrc device-index=0 ! videoconvert ! warptv ! videoconvert ! autovideosink
  • Now what really worked for me is the Screen recorder for Chrome This plugin is really free (right now) and it allows you to record a particular window (not the whole screen). You can also pause and resume the recording, which is very useful for someone like me, who is not a seasoned presenter.

Lots of options here. It took me some research to find something that works for me, as usual...


Another neat Chrome browser plugin is Markdown Preview plus ; it allows you to view local markdown files in Chrome, rendered as html. After installation, you need to visit the url chrome://extensions/, press on details for 'Markdown Preview Plus' and enable the option 'Allow access to file URLs'. Works like a charm (previously, i was pushing every change of a markdown file to github, just to view the change rendered - that creates a very convoluted commit history, mind my French ;-) )

---05/10/21 02:38:15----------------------

It is possible to use the jdb debugger, in order to get a stack trace of a java thread that got stuck, for a process running in a container of a pod of a kubernetes cluster. Every tool has its uses...

On the server side:

  • The java process needs to be invoked with the following command line arguments: -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000 this means that the process listens on port 8000 for remote debugging requests.
  • The kubernetes cluster needs an ingress on port 8000

On the client side;

  • forward the debugging port to the pod that contains the java process that is to be debugged. kubectl port-forward POD_RUNNING_JAVA -n POD_NAMESPACE 8000:8000 While forwarding is running in one console, start a new console to run the following commands:
  • Run jdb and attach it to the process via the port: jdb -attach localhost:8000
  • run the following commands in jdb: threads - this one lists all threads in the form (<Thread_name>)<thread_id>, where the thread_name is what you pass to Thread.setName
  • suspend - suspend all running threads.
  • where <thread_id_hex> - shows the stack trace of the remote thread

jdb is a very low level java debugger, but it has it's uses.

---10/09/21 14:26:50----------------------

One interesting aspect in software development is Software rot, or otherwise known as bit rot.

I came across this thing in one of my projects, kind-helper. The project is a helper script for setting up a kubernetes test cluster with the kind utility. One of the features of this project is to set up an ingress (you can optionally set up a tls/https ingress). The problem was that the format of the ingress object changed, therefore the whole edifice ceased to work.

However I did make some use of github actions - they give us the chance to set up a continuous integration environment that can test a github project with public access on a regular basis. I have managed to use this feature, granted to us by our benevolent owners, in order to test the kind-helper project on a regular basis. Therefore I got notice of the fact that the integration tests for this project had ceased to work. Now if you ever find someone bitching about automatic tests - like unit tests/integration tests - then my response would be that this technique will give you some notification of any changes in the environment which may affect your software. I think that there are a lot of possibilities for environmental changes; most of them come down to changes in dependencies (libraries or components that your components depend on). These changes are a very frequent cause of errors in software development.

---26/08/21 23:16:37----------------------

I am looking a bit into the Rust programming language; i think it is interesting that this language has not gained wide acceptance as of 2021 - right now there are very few job openings that require Rust.

Compare this to golang, which did gain widespread acceptance between 2012 and 2021 (in 2012 they had version 1.0 of golang released); Rust's version 1.0 came out in 2015.

What they have in common

  • Both languages have a killer feature over their competitors.
    • Golang is competing with the JVM platform; golang has goroutines (easier concurrency with n:m threading - n cooperative threads running on m operating system threads) and compilation to standalone executables (unlike JVM based languages it does not pay a startup cost for running in interpreted mode before just-in-time compilation has completed); golang is competing with Java/JVM based languages for application server stuff.
    • Rust is competing with C++; its killer feature is safe code - a guarantee against buffer overflows in safe code. Rust has a very rich syntax, it is inspired by Ocaml, it has a match statement that can do deep pattern matching (like in Scala), and has a very flexible macro facility.

Where are the differences?

  • It is very easy to pick up golang for someone who has worked with java. It has the backing of Google, which has a lot of clout in software development (even though they are not known as development tool vendors)
  • Rust is introducing very new features, and has a steeper learning curve. Also Mozilla fired the team responsible for developing Rust in 2020 here - the language was later picked up by the Rust foundation, but the whole process raised doubts about the future of Rust.
  • I think that golang has a better defined niche with a lot of demand; it is targeting server software development in a managed language; The niche of Rust would be security sensitive code running as a native executable, but also server code that has to be secure and better performing than managed code. I guess perceived ease of development and deployment wins over more efficiency and more secure code.

I suspect that there are several factors that determine programming language adoption

  • The viability of the platform,
  • The availability of programmers for a programming language is the single most important factor in programming language adoption. I think the continued popularity of C++ and Java has much to do with the fact that an enterprise wants to treat programmers as interchangeable screws, and a less widespread programming language would make this practice much more difficult.
    • C++ was designed to be easier to learn by means of backward compatibility with C; This decision was good for programming language adoption; it was a good trade off, even at the expense of being a source of many issues with the language.

Interesting corollary: I think that an enterprise with a less common programming language (like Scala or Rust), would have to treat its programmers much better than a competing shop that uses a commonly used platform like Java or C++; they have a greater investment in their workforce, due to the language/platform issue, are probably more likely to raise salaries every now and then and would be less likely to 'hire and fire'...


Still, i think that Rust will gain some acceptance, at some stage. Consider these projects: rustls and ring ; they aim to reimplement libssl/libcrypto in Rust, so that you will not be likely to see any buffer/stack overruns...

---13/08/21 14:31:52----------------------

Other essential kubectl commands, really helpful ones for solving real problems with kubernetes usage.

This lists all events that happened in the namespace; Pods starting and crashing, etc. etc.

kubectl get events -n NAMESPACE

What is going on with my pods? Who is crashing and restarting? This lists all the bad things that happened in the namespace;

kubectl get events -n NAMESPACE | grep -i failed

This shows all the logs of a container of a crashed pod (the names of crashed pods are in the event log produced by the previous command); note that you have to give the name of a container.

kubectl logs --previous POD-NAME-OF-CRASHED-POD -n NAMESPACE -c CONTAINER-NAME-IN-CRASHED-POD

If you don't know the name of a container in the pod, then the following command will remind you:

kubectl logs --previous POD-NAME-OF-CRASHED-POD -n NAMESPACE

Now there is the command of pure magic; this one gives you an interactive shell (much like ssh) into a container of a running pod.

kubectl exec --stdin --tty POD-NAME -c CONTAINER-NAME -n NAMESPACE -- /bin/bash

Of course this command may fail, if the container does not have the bash shell installed. In this case, the following command tells you which shell is installed; note the quoting - without it, your local shell would expand $SHELL before kubectl ever runs (this doesn't work if the container does not have a shell installed at all, which also happens)

kubectl exec --stdin --tty POD_NAME -c CONTAINER-NAME -n NAMESPACE -- sh -c 'echo $SHELL'

---27/07/21 03:04:38----------------------

The following code used to work in python2.7, but it doesn't work in python3.

You can't iterate over the keys of a dictionary in python3, and modify the underlying dictionary within the loop.

In python2 this worked just fine: the keys() method of dict returned a list object holding a copy of the keys.

In python3 it returns a special set-like view into the keys of the dictionary, of type dict_keys. This speeds up iterating over the keys of a dictionary, but now you can't modify the dictionary while iterating over it.

map={ "first":1, "second" : 2, "third" : 3 }

print("keys type: {}".format(type(map.keys())))

for k in map.keys():
    if k == "second":
        # in python3 this gives the error; RuntimeError: dictionary changed size during iteration
        del map[k]

print(map)

In python3 you have to copy the keys explicitly into a list, to get the old behavior.

map={ "first":1, "second" : 2, "third" : 3 }

print("keys type: {}".format(type(map.keys())))

for k in list(map.keys()):
    if k == "second":
        del map[k]

print(map)
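An alternative that sidesteps the issue entirely (my preference, not from the original gotcha): build a new dictionary with a comprehension instead of deleting keys in place.

```python
d = {"first": 1, "second": 2, "third": 3}

# build a new dict instead of mutating the old one while iterating over it
d = {k: v for k, v in d.items() if k != "second"}

print(d)  # → {'first': 1, 'third': 3}
```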

Also the built-in map function returns an iterator-like object in python3, so it does lazy evaluation; in python2 you would get a list as the return value. This has the advantage that the result of map is evaluated lazily, only when it is needed. On the other hand, you can iterate only once over the result of map, unlike in python2.

def inc(x):
    print("calling inc with ", x)
    return x + 1

res = map( inc, [1,2,3] )

print("return type of map", res)

print("first iteration:")
for n in res:
    print(n)

print("second iteration:")
for n in res:
    print(n)

I used to step onto this rake repeatedly in the past; let's see if writing it down will prevent a similar repetition in the future... Another lesson: some say that 'python is easy', however python keeps changing with every major and minor release, which is confusing. (perl5 doesn't do so, therefore perl scripts have a better chance of keeping running without change; there comes a trade-off with every decision, in the land of computers...) Similar trivia: range used to be a function in python2.7 that returned a list of numbers; in python3 it is a type constructor that returns a range object, which uses much less memory.
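A quick sketch of the range difference: in python3, range gives a lazy but re-iterable object, unlike the one-shot iterator returned by map.

```python
r = range(3)
print(r)            # → range(0, 3) - a lazy range object, not a list
print(list(r))      # → [0, 1, 2]
print(list(r))      # → [0, 1, 2] - re-iterable, unlike a map iterator

m = map(lambda x: x + 1, [1, 2, 3])
print(list(m))      # → [2, 3, 4]
print(list(m))      # → [] - the iterator is exhausted after one pass
```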

A good collection of differences between python 2 and python 3 is here


I later wrote this article on the python module system here

---22/07/21 04:05:11----------------------

Wrote a python script that gathers logs in order to identify a situation described in the previous entry link to script

The scripts help text:

usage: follow-kube-logs.py [-h] [--namespace NAMESPACE]
                           [--deployment DEPLOYMENT] [--stset STATEFULSET]
                           [--rset REPLICASET] [--out OUTDIR]
                           [--kubectl KUBECMD] [--trace] [--complete-bash]
                           [--complete]

This program starts to follow the logs of containers in all pods of a
kubernetes deployment/replicaset/statefulset. The output is written to a file
per container. The script then waits for user input, logging is stopped once
the user has pressed enter.

optional arguments:
  -h, --help            show this help message and exit

log  pods/containers in either one of deployment/replicaset/statefuleset:
  --namespace NAMESPACE, -n NAMESPACE
                        optional: specify namespace of deployment (default: )
  --deployment DEPLOYMENT, -d DEPLOYMENT
                        name of deployment (default: )
  --stset STATEFULSET, -s STATEFULSET
                        name of statefull set (default: )
  --rset REPLICASET, -r REPLICASET
                        name of replica set (default: )
  --out OUTDIR, -o OUTDIR
                        mandatory: name of output directory (default: )
  --kubectl KUBECMD, -k KUBECMD
                        optional: name of kubectl command (default: kubectl)
  --trace, -x           optional: enable tracing (default: False)

suport for bash autocompletion of command line arguments:
  --complete-bash, -b   show bash source of completion function (default:
                        False)
  --complete, -c        internal: used during code completion (default: False)
  --kubectl KUBECMD, -k KUBECMD
                        optional: name of kubectl command (default: kubectl)

So that by running ./follow-kube-logs.py -n my-namespace -d my-deployment -o logdir you will create the directory logdir; the script will create a subdirectory for each pod running in the deployment my-deployment in namespace my-namespace and spawn a process that follows the logs of each of the containers of that pod, gathering the logs for that container while the script is running. The script then waits and asks the user to press enter, whereupon it kills the spawned processes and stops the logging.

The purpose of this script is to be a more lightweight solution than using prometheus/grafana for viewing your deployment logs, as it is sometimes easier to grep through the logs than to write elaborate prometheus queries.

In a way kubernetes is our generation's jcl (job control language on ibm mainframes); there is a remote similarity in how we write descriptors for tasks, submit them for execution and wait till the mainframe/kubernetes cluster has considered our specification. (suddenly feeling old because of this comparison ...)

update: changed the script, now it scans the deployment once per second for changes, new pods are logged too.

update: added command completion, now to enable command completion, place the script in the path and run follow-kube-logs.py -b >>$HOME/.bashrc

update: made a repository for this script link

---19/07/21 03:25:02----------------------

Another well known rake: a client library that does its own caching is used in an implementation of a service that runs in multiple instances.

Instance A gets a modify foo request, followed by a get foo request; naturally the modification of foo puts the correct result in the cache of A, so that the subsequent get request returns the correct value. Instance B receives a get request for foo and returns its stale cached value of foo.
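A toy simulation of this scenario (the classes are made up for illustration): two service instances wrap the same backing store, each with its own private cache.

```python
class Store:
    """The shared backing store, e.g. a database or a remote service."""
    def __init__(self):
        self.data = {}

class CachingClient:
    """A client library that does its own caching, one per service instance."""
    def __init__(self, store):
        self.store = store
        self.cache = {}

    def get(self, key):
        if key not in self.cache:          # cache miss: read through
            self.cache[key] = self.store.data.get(key)
        return self.cache[key]

    def put(self, key, value):
        self.store.data[key] = value       # write through to the store...
        self.cache[key] = value            # ...but only update our own cache

store = Store()
a, b = CachingClient(store), CachingClient(store)

a.put("foo", 1)
print(b.get("foo"))  # → 1 (B caches the current value)
a.put("foo", 2)
print(a.get("foo"))  # → 2 (A sees its own write)
print(b.get("foo"))  # → 1 (stale: B's cache was never invalidated)
```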

Recently stumbled upon this with the Okta java sdk - here it was not obvious that the client does its own caching. (another rake: the github project does not include all the sources; very important classes like DefaultClient and DefaultUser are missing; luckily you can see them with any good java decompiler, like the one in IntelliJ)

Luckily it is relatively easy to cancel the caching, by supplying a custom CacheManager instance to the ClientBuilder (one that returns a non-caching Cache instance on each request).

Another interesting case is the aws KMS client, which has its own cache, but no default is set for the cache expiration; so i ended up with a cache that does not do any caching....


Actually this would make a good interview question: 'how would you deal with a caching client library used in a multi pod deployment of a service? describe the problem scenario and a solution' - as there is more than one way to solve this problem.

  • One could cancel the caching in the client.
  • In the case of Okta: one could rewrite the code to use the REST api - this doesn't come with its own client side caching.
  • One could use a special router to route all requests that change a specific entity to the same service instance.
  • One could write such routing logic as part of the service, and delegate work with the caching client to a third service, based on this logic.
  • etc. (probably a few other approaches as well).

(hope that i won't have to answer my own job interview questions any time soon, interviewing for a job gives me post traumatic stress disorder ;-)

---07/07/21 04:54:26----------------------

Difference between java streams: map and forEach.

I recently had the following pipe in a java program:

var result = myList.stream().map(...).filter(...).collect(Collectors.toList());

Now after some revision it changed to

 myList.stream().map(...);

Now that one is never executed. The reason, of course, is that map is an intermediate operation: without a terminal operation (like collect) consuming the stream, the pipeline is never run at all. The source of this error is that the function passed to map may not be purely functional - it may call other functions that do things like storing records in a database (an implicit side effect that doesn't modify any variable). That's why forEach exists - it is a terminal operation, so the pipeline actually executes....
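A minimal demo of the gotcha: map is lazy, so without a terminal operation its lambda never runs, while forEach is a terminal operation and drives the pipeline immediately:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyMapDemo {
    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();

        // Intermediate operation only: the stream is never consumed,
        // so the lambda (and any side effect inside it) never executes.
        List.of(1, 2, 3).stream().map(x -> calls.incrementAndGet());
        System.out.println("after map only: " + calls.get());  // 0

        // forEach is a terminal operation - it runs the pipeline.
        List.of(1, 2, 3).stream().forEach(x -> calls.incrementAndGet());
        System.out.println("after forEach:  " + calls.get());  // 3
    }
}
```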

That's part of the fine print of mixing functional and object oriented paradigms.

---01/07/21 01:43:59----------------------

The macbook keyboard is ... not very long lasting; after a year or so some quite important keys start to fail. For me these are the arrow keys. Apple says to blow some air at the keyboard, but my fiddling with the vacuum cleaner made it even worse ;-)

  • on GUI tools, like IntelliJ, you can keep going with the trackpad; now you have to make the scrollbar visible all the time:
    Select the Mac icon (in the top left corner)
    Select System Preferences
    Click on General
    Set the "Show scroll bars:" option to Always  

(if i could only make the scrollbar thumb black, so that it is visible at all times, but that's too much to ask for...)

  • for vim there is a second workaround: in normal mode one can navigate with the keyboard:

      hjkl  : h for left, j for next line (down), k for previous line (up), l for right
       bw   : b for previous word, w for next word (these make for some fast navigation!)
       0    : start of line
       $    : end of line
       G    : end of file
       :0   : start of file (gg works too)
    

And there is much more... but I didn't manage to remember most of the stuff...

And you can remap some keys; i remapped , to page down and . to page up

:map , <PageDown>
:map . <PageUp>

but the default vim PageUp and PageDown commands suck a bit; if you do a page down and then a page up you will not land on the same line. So let's do some scripting:

:map ,  :MyPageDown<Return>
:map .  :MyPageUp<Return>


command! -nargs=* MyPageDown call RunMPGD()
command! -nargs=* MyPageUp call RunMPGU()

function! RunMPGD() 
    let s:pagesize = winheight(0)
    let s:filesize = line('$')
    let s:topline = line('w0')

    let s:move = s:pagesize  
    if s:topline + s:pagesize > s:filesize
        let s:move = s:filesize - s:topline
    endif

    "execute "normal" . s:move . "j"
    let s:curline = line('.') + s:move
    let s:col = col('.')
    call setpos(".", [0,  s:curline, s:col ] )
endfunction

function! RunMPGU() 
    let s:curline = line('.')
    let s:pagesize = winheight(0)
    let s:topline = line('w0')
                
    let s:move = s:pagesize
    if s:curline < s:pagesize
        let s:move = s:curline
    endif

    let s:curline = s:curline - s:move
    let s:col = col('.')
    call setpos(".", [0,  s:curline, s:col ] )
endfunction

i guess that's why tools like IntelliJ have a vim emulation mode - to compensate for a broken macbook keyboard, now the keyboard can go on, until the : character is no longer...

In vim one can customize everything, it just takes a lot of time to do so, and when you are done then the result feels like a pyrrhic victory... One problem is that i don't know how to scroll in visual mode, so that the selection is still kept...


Several months later I found a solution for keeping the selection in visual mode too source. Maybe it would have been easier to buy an external keyboard, rather than to compensate for this problem with scripting...

:vnoremap ,  :call RunMPGDV()<Return> 
:vnoremap .  :call RunMPGUV()<Return>

function! RunMPGDV()
    normal gv
    call RunMPGD()
endfunction

function! RunMPGUV() 
    normal gv
    call RunMPGU()
endfunction


Another interesting problem: vim maps j to scrolling down, whereas I would find it more natural for it to scroll up. I left it as is, as an exercise: how easy will it be for me to adjust to a changed assumption in an ingrained cognitive bias? The experiment is still ongoing...


You can also use vim instead of less:

So instead of

find . -name foo | less

It is now

find . -name foo | vim -

So it is the following alias for me. Note that vim doesn't quite work with color escapes, so let's fall back to less when colors are asked for. (Some people advise to use the AnsiEsc vim plugin for color escape codes, but this didn't work for me.)

function _less {
    local arg="$1"
    if [[ "$arg" == "-R" ]]; then
        less -R
    else
        vim -
    fi
}

alias less=_less

One downside: this version of less will wait for the program to complete, before it shows anything. So it is not quite ideal for very long running programs. Everything seems to have its caveats, in the land of computing...


Another well kept bash secret is to use Ctrl-r for searching the command history

---30/06/21 09:34:35----------------------

A unit test starts a grpc service stub; now if you immediately start to send requests to this service stub, then on some environments the stub might not be ready yet to receive the calls; so you need to add a one second sleep between init of the server stub and sending requests to it. Bother!

java grpc has a thread pool dedicated to its stuff, and threads in this pool are listening on and servicing the networking calls. With a thread pool there can be a slight delay between the initiation of something and its execution. Normally the exact point in time when a grpc service comes up is not a big concern, but in this unit test scenario it is an important detail.

   try {
       fooServiceStub = ServerBuilder.forPort(nFooPortNumber).addService(new FooServiceStub()).build();
       fooServiceStub.start();
   } catch (IOException ex) {
       ex.printStackTrace();
   }
       

it's amazing how much of my time at work is spent on fixing failures that happen during CI tests. You sometimes get amazingly different behavior between the test environment on your local machine and the CI environment; often these differences can be explained by different handling of timeouts, not waiting long enough for some service, etc.
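One way to avoid the fixed sleep is to poll until the port actually accepts connections. A library-free sketch (the helper name is made up; this only checks that something is bound to the port, not that the service is fully initialized - grpc also defines a health checking protocol for the more thorough variant):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class WaitForPort {
    // Poll until something on localhost accepts connections on the port,
    // or give up after timeoutMillis. Returns true if the port became ready.
    public static boolean waitForPort(int port, long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress("localhost", port), 200);
                return true; // connection accepted: a server is listening
            } catch (IOException notYet) {
                Thread.sleep(50); // not bound yet, retry
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) { // stand-in for the grpc stub
            System.out.println(waitForPort(server.getLocalPort(), 2000)); // true
        }
    }
}
```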

---30/06/21 04:39:54----------------------

Today i stumbled upon immutable lists in java. In java an attempt to modify an immutable list will throw an exception.
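For example (Java 9+, where List.of returns an unmodifiable list) - the mutation attempt compiles fine and only fails at runtime:

```java
import java.util.List;

public class ImmutableListDemo {
    public static void main(String[] args) {
        List<Integer> frozen = List.of(1, 2, 3); // unmodifiable list
        try {
            frozen.add(4); // compiles, but blows up at runtime
        } catch (UnsupportedOperationException e) {
            System.out.println("add() threw UnsupportedOperationException");
        }
    }
}
```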

That is very different from immutable lists in Scala. There the addition of an element to the list will return a copy of the original list, with the element added (that's how they do functional programming: instead of modifying Foo, you create a copy of Foo with the desired modification).

Compare all this to the C++ approach, which is different again. In C++ you can have a const reference (or a pointer to const); if a member function has been declared as const, then calling this function is not supposed to modify the object; given a const reference one can only call the const member functions of an object. (But you can have exceptions to the rule: if a member of a class is marked with the mutable keyword, then a const function can still change it...)

It seems that they get less consistent about pureness and functional programming when looking at things from a lower level (and more performance oriented) perspective.


Where did I stumble upon immutable lists? java grpc message objects are supposed to be immutable; you can't modify any field - if you call a getter that returns a list, then this list is read only. If you need a slightly different grpc message object, then build a new one.

---27/06/21 04:21:10----------------------

Java has Optional, but I didn't manage to use the option chaining feature of Optional; it has many fine points. It gets very tricky if you have to think about null values (which you have to); see https://www.sitepoint.com/how-optional-breaks-the-monad-laws-and-why-it-matters/

Optional is still useful if you don't try to chain Option with flatMap or map.

Funny thing, this Optional: it was supposed to solve the NullPointerException problem (you either have empty or something), but it creates a set of problems of its own...

Maybe that's why they like golang; there the standard library doesn't try to be too clever. But then it doesn't have functional stream processing on containers out of the box (streams are a time saver in java and scala; it is often less error prone to combine map/flatMap than to write a for loop).


https://www.sitepoint.com/how-optional-breaks-the-monad-laws-and-why-it-matters/

This is an important article, as it explains monads in terms that a java programmer can understand:

So Optional is an 'almost monad' and java streams are full monads, as explained here

Here are my notes:

monads are: (hold your breath, this is an explanation for mere mortals ;-)

  1. you can think of a monad as a java parametrized type Monad<T> with the following functions

  2. unit in Haskell terminology is like a 'builder' function, that takes an element of T and wraps it in an instance of Monad<T> public <T> M<T> unit(T element)

    • for optional: unit is Optional.ofNullable ; this function makes an empty Optional for null argument, and wraps a non null argument in a non empty Optional value.
    • for stream: unit is Stream.of (? what about Stream.of(T ...values) ?)
  3. bind in Haskell terminology is like flatMap in java: bind receives a function f as argument, where f takes an instance of T and returns a Monad<U>; bind applies f to the wrapped value and returns the resulting Monad<U>

    • For java Optional bind (aka flatMap) extracts the contained T value, if it is present, and applies f to it.
    • for java Stream bind (aka flatMap): for each element of the stream, apply f to the element and splice the result of applying f into the output stream. (example: if each element is a list and f turns it into a stream of its elements, then this will turn a list of lists into a simple list that is the concatenation of all the list elements) Note that in both cases f itself returns a wrapped Monad<U>, not a plain value - that is what distinguishes flatMap from map; confusing...
public static <T, U> M<U> bind(M<T> monad, Function<T, M<U>> f) {
    return f.apply(monad.wrappedValue());
}
For example: for an Optional, f is applied only when the optional has a value; for a stream, flatMap replaces each element of the stream with the contents of the stream returned by f.
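The list-of-lists flattening mentioned above, as a runnable java snippet:

```java
import java.util.List;
import java.util.stream.Collectors;

public class FlatMapDemo {
    public static void main(String[] args) {
        List<List<Integer>> nested = List.of(List.of(1, 2), List.of(3), List.of());

        // flatMap splices the stream produced for each element into the
        // output stream - here that concatenates the inner lists.
        List<Integer> flat = nested.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());

        System.out.println(flat); // [1, 2, 3]
    }
}
```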

In addition to that there are the Monad laws (in Haskell terminology); these ensure that you can chain the flatMap (bind) calls together in perfect harmony:

The monad laws:

  1. Left identity law: bind(unit(value), f) === f(value)

The java Optional 'monad' doesn't implement the monad laws correctly; that's why it is hard to chain processing with map/flatMap.
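The left identity law can be seen failing with a short test: pick an f that is defined for null; then bind(unit(null), f) is empty, while f(null) is not:

```java
import java.util.Optional;
import java.util.function.Function;

public class OptionalLeftIdentity {
    public static void main(String[] args) {
        // f handles null itself and always returns a non-empty Optional
        Function<String, Optional<String>> f =
            s -> Optional.of(s == null ? "was null" : s.toUpperCase());

        String value = null;
        Optional<String> left = Optional.ofNullable(value).flatMap(f); // empty: f never runs
        Optional<String> right = f.apply(value);                       // Optional["was null"]

        // Left identity demands these be equal; with Optional they are not.
        System.out.println(left.equals(right)); // false
    }
}
```

For non-null values the law does hold - the breakage is exactly in how ofNullable conflates null with empty.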

---26/06/21 05:41:07----------------------

While writing a recursive descent parser:

Always ask yourself where the current position is after the end of parsing something. This is often mixed up when the function for parsing a term is combined with others...
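A toy example of that convention (made-up grammar: sums of integer literals); every parse function leaves the position on the first character it did NOT consume, and parseExpr relies on that contract:

```java
// Minimal recursive descent parser for sums like "12+3+40".
// Convention: every parseXxx() leaves pos at the first unconsumed character.
public class TinyParser {
    private final String input;
    private int pos = 0;

    TinyParser(String input) { this.input = input; }

    int parseExpr() {                 // expr := number ('+' number)*
        int sum = parseNumber();
        while (pos < input.length() && input.charAt(pos) == '+') {
            pos++;                    // consume '+'
            sum += parseNumber();     // relies on parseNumber's position contract
        }
        return sum;
    }

    int parseNumber() {               // number := digit+
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            pos++;                    // on return, pos is just past the digits
        }
        return Integer.parseInt(input.substring(start, pos));
    }

    public static void main(String[] args) {
        System.out.println(new TinyParser("12+3+40").parseExpr()); // 55
    }
}
```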

---24/06/21 09:44:42----------------------

gcc on osx isn't gcc... it's clang. (hideous, when a thing called foo isn't foo, but something else instead)

the two of them spit out the same version info.

gcc --version

    Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/c++/4.2.1
    Apple clang version 12.0.0 (clang-1200.0.31.1)
    Target: x86_64-apple-darwin19.6.0
    Thread model: posix
    InstalledDir: /Library/Developer/CommandLineTools/usr/bin

clang --version

    Apple clang version 12.0.0 (clang-1200.0.31.1)
    Target: x86_64-apple-darwin19.6.0
    Thread model: posix

also

gcc
    clang: error: no input files

brew install gcc

lets find where it is now:

# search for executables only, of a given pattern (on osx)
find / -perm +111 -type f -name 'gcc*' 2>/dev/null

    /usr/local/Cellar/gcc/11.1.0_1/bin/gcc-ranlib-11
    /usr/local/Cellar/gcc/11.1.0_1/bin/gcc-11
    /usr/local/Cellar/gcc/11.1.0_1/bin/gcc-ar-11
    /usr/local/Cellar/gcc/11.1.0_1/bin/gcc-nm-11
    /System/Volumes/Data/usr/local/Cellar/gcc/11.1.0_1/bin/gcc-ranlib-11
    /System/Volumes/Data/usr/local/Cellar/gcc/11.1.0_1/bin/gcc-11
    /System/Volumes/Data/usr/local/Cellar/gcc/11.1.0_1/bin/gcc-ar-11
    /System/Volumes/Data/usr/local/Cellar/gcc/11.1.0_1/bin/gcc-nm-11

/usr/local/Cellar/gcc/11.1.0_1/bin/gcc-11 --version 

    gcc-11 (Homebrew GCC 11.1.0_1) 11.1.0
    Copyright (C) 2021 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

that's more like it.

Now there is no way to change the symlink /usr/bin/gcc to point at the actual gcc, not even as root (not very unixy, but apple can get away with it...)

but it works in /usr/local/bin

ln -sf /usr/local/bin/gcc-11 /usr/local/bin/gcc
ln -sf /usr/local/bin/g++-11 /usr/local/bin/g++

Interesting that the brew installation doesn't come with ld; hmm...

Also you must put /usr/local/bin in the PATH before /usr/bin


Another big one: the built-in sed on the mac is kind of weird: it adds a trailing newline to every file, and differs from GNU sed in other respects as well.

Luckily we can install the regular gnu sed: brew install gnu-sed. Now you have to invoke it as gsed.


another one: if you need the gnu grep then you can get it with brew install grep, but you can only use it as ggrep, gegrep and gfgrep ...

---22/06/21 02:41:22----------------------

The term 'opinionated' as used in software, see this discussion

Opinionated software means that there is basically one way (the right way™) to do things and trying to do it differently will be difficult and frustrating. On the other hand, doing things the right way™ can make it very easy to develop with the software as the number of decisions that you have to make is reduced and the ability of the software designers to concentrate on making the software work is increased

Google ngram says

google ngram viewer says the term took off during 2003, with peak usage during 2009. Often cited examples of 'opinionated software' are 'ruby on rails' and 'spring boot' (rails appeared in 2005)

this article says that spring boot is 'too opinionated'; it brings in a lot of dependencies (jpa, netty for grpc, etc. etc. etc.). The author says that the choice of spring boot therefore determines a lot of choices up front... (like the versions of the dependent packages that must be used)

The opposite of 'opinionated software' is the perl philosophy of 'There's more than one way to do it' link (or here)

I think that 'opinionated software' often starts where there are a lot of dependencies; then comes the 'spring boot' layer on top of this mess, which also binds you to a specific way of doing things that is supposed to make life simpler. More centralisation may bring more certainty, but it drastically limits freedom of choice.

What is the difference between a library and a framework? The framework calls your components while it starts up; it has the main function (entry point). This is also called 'inversion of control' (everything in software must have a fancy name). Inversion of control . Interesting that perl used to have a lot of libraries on cpan, but not too many frameworks. However there are 'frameworks' that don't have their own main function, for example bottle in python doesn't do a thing like that (to be fair, bottle is a 'microframework' and not a proper 'framework')

---22/06/21 01:38:59----------------------

today i learned that okta is charging per created user https://www.okta.com/pricing/ that's how you do business, never thought of such an option...

---21/06/21 17:38:22----------------------

Magic alias that removes spaces from file names and puts _ instead of them. Works on osx! (from here )

alias nospaceinfilenames='for f in *\ *; do mv "$f" "${f// /_}"; done'

Other goodies: ps sorted by cpu and memory. also less -R can browse through output with colors! (should be a default behavior, in my opinion)

alias pstopcpu="ps -eo pcpu,pid,user,args | sort -n -k 1 -r | awk '"'{ $1="\033[31m"$1"%\033[0m"; $4="\033[31m"$4"\033[0m"; print }'"' | less -R"

pstopmem lists processes with top memory usage on top (first column in red):

alias pstopmem="ps -eo vsz,pid,user,args | sort -n -k 1 -r | awk '"'{ $1="\033[31m"$1 / 1000"Mib\033[0m"; $4="\033[31m"$4"\033[0m"; print }'"' | less -R"

also it helps to have this one in your .bashrc file; puts the current git branch in the prompt, so that you don't work on the wrong branch

  parse_git_branch() {
    git branch 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/ (\1)/'
  }

  PS1="[\u@\h \W\$(parse_git_branch)]\$ "

The ultimate refactoring tool: search and replace a string for all files under git:

git ls-files -z | xargs -0 sed -i -e "s/FROM_STRING/TO_STRING/g" 

now for osx you need to use gnu sed, and not the built in one, as mentioned previously. This script takes care of it.

---21/06/21 16:55:34----------------------

stackoverflow search got much better. Once upon a time google search was preferable, now it makes sense to use stackoverflow search...

i wonder if they are still using elasticsearch for search. Elasticsearch used the TF/IDF algorithm, but stored the word count separately for each shard, so that you often get inconsistent search results. (it needs an index of the frequency of each token/word to function)

They say that now they are also using Okapi BM25 here (but that one also uses inverse document frequency)


Another useful tool is github code search especially if you use a few of the search operators explained here

One interesting usage, besides searching for a code pattern, is to search for interesting projects on github; also see searching for repositories and searching for topics

stars:>=10 language:cpp in:readme games this searches for projects that have more than ten 'stars', are using the c++ programming language, and have the word "games" in the readme.

https://github.com/search?p=4&q=%22curated+list%22&type=Repositories searching for repositories with the phrase 'curated list' gives you a lot of cool link collections.

https://github.com/search?q=type%3Auser+followers%3A%3E1000 this query gets you a list of celebrity github users with more than 1000 followers.


What we learned from this: being a software developer means to be a master in the art of searching for stuff. I think that Google Fu is no longer sufficient, as the web has grown too big for one single company to handle (or maybe the focus of our masters at the big search company has changed, no way to be sure on that).

---21/06/21 16:30:32----------------------

show dependencies with gradle

    cd <directory of sub project>

    gradle -q dependencies

keeping all the dependency versions happy is a challenge....

---21/06/21 16:14:48----------------------

On the github page: a comment with /retest will cause the CI to rerun the build and test. (/rebuild doesn't work). This may be a feature of the particular CI scripts that we are using, don't know.


The CI is only keeping the standard output of gradle; here you need to find which test failed, so search for the string '() FAILED';

also one should search for STANDARD_OUT - this string marks the start of each junit test; and the test runner likes to randomize the ordering of the tests...

---21/06/21 15:49:21----------------------

gradle, oh gradle.

gradle cleanTest test --fail-fast 2>&1 | tee log.log

  • without cleanTest it will not rerun the tests after a successful run (why?)
  • --fail-fast tells it to stop on the first test failure (much easier to debug the tests this way).

I have the following function in my .bashrc file (and it's always better to keep a log of that compilation/test run)

function makeIt {
    if [[ -f build.gradle ]]; then
        cmd=$1
        if [[ $cmd == "" ]]; then
            gradle cleanTest  test --fail-fast 2>&1 | tee log.log
            #gradle cleanTest build  2>&1 | tee log.log
        else
            gradle "$@" 2>&1 | tee log.log
        fi
    else
        if [[ -f pom.xml ]]; then
            cmd=$1
            if [[ $cmd == "" ]]; then
                mvn compile 2>&1 | tee log.log
            else
                mvn "$@" 2>&1 | tee log.log
            fi
        else
            if [[ -f Makefile  ]] || [[ -f makefile ]] || [[ -f GNUmakefile ]]; then
                make "$@" 2>&1 | tee log.log
            else
                echo "don't know how to make this, yet"
            fi
        fi
    fi
    beep.sh
}

alias m='makeIt'

the m alias runs the relevant build command in the current directory, writes a file with the output, and makes a beep at the end; this signals you to get back from sword fighting


i have also written this script, which is github specific: makepr.py It automates the creation of a pull request, waits until the continuous integration build for the top commit has completed, then notifies you upon completion of the build.

---21/06/21 15:47:12----------------------

Without this option, java will start to omit stack traces (once the same exception has been thrown many times): -XX:-OmitStackTraceInFastThrow

Why did they do that? Why?

Another big thing is -XX:CompileThreshold

It turns out that this value is 10000 on the HotSpot server VM; this means that a java method is just-in-time compiled only after it has been executed 10000 times.

See this link

That's one of the advantages of golang: there is no hard to predict just-in-time compilation, everything is compiled up-front, and there is less of this strange warm-up time after starting a service. On the other hand, golang produces executables of enormous size.

$ which kubectl
/usr/local/bin/kubectl

$ stat -x /usr/local/bin/kubectl
  File: "/usr/local/bin/kubectl"
  Size: 55           FileType: Symbolic Link
  Mode: (0755/lrwxr-xr-x)         Uid: (    0/    root)  Gid: (   80/   admin)
Device: 1,5   Inode: 2013626    Links: 1
Access: Wed Sep  2 19:20:39 2020
Modify: Wed Sep  2 19:20:39 2020
Change: Wed Sep  2 19:20:39 2020

$ ls -l /usr/local/bin/kubectl
lrwxr-xr-x  1 root  admin  55 Sep  2  2020 /usr/local/bin/kubectl -> /Applications/Docker.app/Contents/Resources/bin/kubectl

$ stat -x /Applications/Docker.app/Contents/Resources/bin/kubectl
  File: "/Applications/Docker.app/Contents/Resources/bin/kubectl"
  Size: 50104496     FileType: Regular File
  Mode: (0755/-rwxr-xr-x)         Uid: (  502/michaelmo)  Gid: (   80/   admin)
Device: 1,5   Inode: 17339265    Links: 1
Access: Thu Oct  7 10:12:44 2021
Modify: Wed Dec  9 17:24:19 2020
Change: Fri Dec 11 05:22:00 2020

Kubectl is 50 megabytes big, arguably that's quite a bit for this command line program.

---29/07/21 06:08:37----------------------

Visual studio code shortcuts: https://code.visualstudio.com/shortcuts/keyboard-shortcuts-macos.pdf

Command shift + ::: make everything larger

Command shift - ::: make everything smaller

Close outline at the right of the screen (this one is called "minimap")

Command , ::: opens the settings page, search for minimap, uncheck 'minimap enabled'

---29/07/21 06:15:02----------------------

IntelliJ goodies on OSX. https://www.jetbrains.com/help/idea/mastering-keyboard-shortcuts.html

press shift twice in quick succession, then in the window press the 'symbols' tab ::: can show the source of any symbol, has a good built-in decompiler!

Command B ::: go to declaration.

Command L ::: go to line number

VisualStudioCode F12 ::: Jump to definition/declaration.

VS Code got auto save disabled by default !!! IntelliJ has autosave on by default.

Ctrl+G  - goto line number