Skip to content

Conversation

@Abacn
Copy link
Contributor

@Abacn Abacn commented Apr 11, 2024

Fixes #1362

Action item of #1424 (comment)

Affected PRs are notified: #1392 #1376

Target to merge once Template upgrade to Beam 2.56.0

@Abacn Abacn requested a review from a team as a code owner April 11, 2024 20:10
@Abacn Abacn requested review from darshan-sj and shreyakhajanchi and removed request for a team April 11, 2024 20:10
@Abacn Abacn marked this pull request as draft April 11, 2024 20:12
@codecov
Copy link

codecov bot commented Apr 11, 2024

Codecov Report

Attention: Patch coverage is 0% with 13 lines in your changes missing coverage. Please review.

Project coverage is 47.94%. Comparing base (40e1a8b) to head (4362fca).
Report is 52 commits behind head on main.

Files with missing lines Patch % Lines
...google/cloud/teleport/spanner/ImportTransform.java 0.00% 3 Missing ⚠️
...le/cloud/teleport/spanner/TextImportTransform.java 0.00% 2 Missing ⚠️
...d/teleport/templates/common/SpannerConverters.java 0.00% 2 Missing ⚠️
...ogle/cloud/teleport/spanner/ApplyDDLTransform.java 0.00% 1 Missing ⚠️
...google/cloud/teleport/spanner/ExportTransform.java 0.00% 1 Missing ⚠️
...com/google/cloud/teleport/spanner/ReadDialect.java 0.00% 1 Missing ⚠️
.../cloud/teleport/spanner/ReadInformationSchema.java 0.00% 1 Missing ⚠️
...google/cloud/teleport/templates/SpannerToText.java 0.00% 1 Missing ⚠️
...leport/templates/SpannerVectorEmbeddingExport.java 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1429      +/-   ##
============================================
+ Coverage     46.97%   47.94%   +0.96%     
+ Complexity     4047     4032      -15     
============================================
  Files           876      867       -9     
  Lines         52203    51043    -1160     
  Branches       5502     5381     -121     
============================================
- Hits          24524    24473      -51     
+ Misses        25922    24826    -1096     
+ Partials       1757     1744      -13     
Components Coverage Δ
spanner-templates 68.91% <0.00%> (-0.01%) ⬇️
spanner-import-export 65.68% <0.00%> (-0.02%) ⬇️
spanner-live-forward-migration 76.54% <ø> (ø)
spanner-live-reverse-replication 78.80% <ø> (ø)
spanner-bulk-migration 87.94% <ø> (ø)
Files with missing lines Coverage Δ
...ogle/cloud/teleport/spanner/ApplyDDLTransform.java 0.00% <0.00%> (ø)
...google/cloud/teleport/spanner/ExportTransform.java 16.00% <0.00%> (ø)
...com/google/cloud/teleport/spanner/ReadDialect.java 0.00% <0.00%> (ø)
.../cloud/teleport/spanner/ReadInformationSchema.java 0.00% <0.00%> (ø)
...google/cloud/teleport/templates/SpannerToText.java 0.00% <0.00%> (ø)
...leport/templates/SpannerVectorEmbeddingExport.java 0.00% <0.00%> (ø)
...le/cloud/teleport/spanner/TextImportTransform.java 43.47% <0.00%> (ø)
...d/teleport/templates/common/SpannerConverters.java 75.00% <0.00%> (ø)
...google/cloud/teleport/spanner/ImportTransform.java 22.40% <0.00%> (ø)

... and 7 files with indirect coverage changes

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@damccorm
Copy link
Contributor

I want to chime in here: I think this is the right change to make. It dramatically simplifies our infrastructure and it is generally a much better practice to have a single source of truth, rather than have 2 different similar but not quite the same IOs. More urgently, we need to fix the conflict problem mentioned in #1362 - it is not reasonable to expect the release manager to dedicate special time to this IO every release, which is what the current setup requires.

However, I also recognize that this fork was done for a reason, and this change introduces challenges. Waiting on Beam changes to roll out is slower and means it takes fixes longer to roll out. In the short term, merging this would also regress #1392 and #1376. I'm not sure how urgent those are, but it would certainly be a cost. In general, I'd probably prefer we at least merge this alongside the next Beam release/templates upgrade to avoid that problem.

Moving forward, I think we have a few options:

  1. Merge this PR with the next Beam release (preferred IMO)
  2. Find another way to fix [Bug]: LocalSpannerIO package namespace conflict with Beam's SpannerIO  #1362 so that it doesn't require release manager intervention to upgrade the Beam version every time.
  3. Do nothing - IMO this is a bad option though because of the maintenance burden it imposes, and if we do this we'd likely at least require the spanner team to be more involved in Beam version upgrades

@Abacn
Copy link
Contributor Author

Abacn commented Apr 12, 2024

Currently test fail with error

Error message from worker: java.lang.IllegalArgumentException: Unknown spanner type FLOAT32 

needs upcoming Beam release to include apache/beam#30893

@manitgupta
Copy link
Member

I agree that this something we need to get done. This is something we wanted to go to towards the later half of the year. LocalSpannerIO got added in the first place due to Beam's release schedule being slower than the release velocity the Spanner team desires in launching new features to the customers.

@damccorm / @Abacn - How do other teams handle this today? I would like to understand if there is a fundamental reason why Spanner needs to this differently, or if other services do not encounter similar release velocity issues.

Can you please let me know what support is needed from the Spanner team to get this through?

There are some implementation gaps between SpannerIO in Beam and LocalSpannerIO, which need to be covered to ensure we don't create a parity difference between the two.
@darshan-sj Can you please evaluate?

@darshan-sj
Copy link
Contributor

In LocalSpannerAccessor line number - 44 to 56, there are some retry settings which are set for ExecuteStreamingSQL method. These are required for Databoost feature to work properly. They are not present in Apache Beam and need to be moved to SpannerAccessor in Apache Beam.
I would suggest going with Option 2 from @damccorm 's comment for now. Then evaluate the diff properly and remove LocalSpannerIO once the diff is migrated to Beam and that version is consumed.

Regarding Beam's release schedule, @damccorm / @Abacn How do other teams deal with this issue? Apache Beam takes 2+ months for release.

@damccorm
Copy link
Contributor

How do other teams handle this today? I would like to understand if there is a fundamental reason why Spanner needs to this differently, or if other services do not encounter similar release velocity issues.

I think this is somewhat of an unsolved problem. Releasing Beam more often may eventually be the answer, but we're not ready for that yet. Existing approaches generally are:

  1. Something close to what you are doing
  2. Deal with the slow release cycle

In general, IOs tend to be mostly stable for teams, so (2) is acceptable for most. It sounds like that isn't true for your team - what is driving your velocity needs here? (I recognize that faster releases are just generally better, but I'm also trying to understand if there are things beyond that, e.g. a big feature, etc...).

Teams will also sometimes go with option (2), and then temporarily fork the code if an urgent issue comes up

Can you please let me know what support is needed from the Spanner team to get this through?

I think we need to drive one of the options above - either:

  1. Merge this PR with the next Beam release - this would also require fixing the gaps @darshan-sj called out in Beam head (which I think we should do no matter what)

or

  1. Find another way to fix [Bug]: LocalSpannerIO package namespace conflict with Beam's SpannerIO  #1362 so that it doesn't require release manager intervention to upgrade the Beam version every time.

I'm actually fine with either - I think (1) is less painful long term, but I'm ok with your team making that tradeoff if you want to go with option (2). If you decide to go that route, I'd ask your team to drive that fix forward.


In the very short term: lets make sure that the local Spanner IO is totally in sync with the one at Beam head so that the next release upgrade is painless.

@Abacn
Copy link
Contributor Author

Abacn commented Apr 15, 2024

Note that the real issue here is #1362 - LocalSpannerIO is under org.apache.beam.io.spanner namespace. This has two issues

It is fine to add some logic in DataflowTemplate to make use of its faster release schedule than Beam. Actually this is one of the advantage of the Template. However, the code should under "com.google.cloud.teleport" namespace

I kindly remember that originally it used the same namespace of Beam SpannerIO because some modification is needed to access package private methods of the SpannerIO. It is not a valid workaround by hacking java access modifiers. If needed, we can have a LocalSpannerIO, as a wrapper of SpannerIO, but it needs to be designed in a way that work under Templates' own namespace.

@manitgupta
Copy link
Member

Thank you @damccorm and @Abacn for your points. I largely agree with the points you have mentioned here. Specifically -

I'm actually fine with either - I think (1) is less painful long term, but I'm ok with your team making that tradeoff if you want to go with option (2). If you decide to go that route, I'd ask your team to drive that fix forward.

Option 2 is -

Find another way to fix [Bug]: LocalSpannerIO package namespace conflict with Beam's SpannerIO #1362 so that it doesn't require release manager intervention to upgrade the Beam version every time.

One of the possible solutions is suggested by @Abacn -

It is fine to add some logic in DataflowTemplate to make use of its faster release schedule than Beam. Actually this is one of the advantage of the Template. However, the code should under "com.google.cloud.teleport" namespace

This looks like a reasonable approach, but will require some investment to make the changes and fix the namespace conflicts as well as redesign LocalSpannerIO in a way that it can extend SpannerIO.

I am not yet sure what approach we should take here - it seems to me that Spanner needs to have an internal discussion with its feature teams to align on the following -

  1. Why do Spanner features need to go in more frequently compared to other services? What do we lose if we wait (Option 1)?
  2. Is there a technical solve (similar to one suggested by @Abacn) that can solve this problem without taking a hit on the release velocity?

I will discuss this internally and get back to you. I like option 2 but I don't think it falls under the "short term" category.


For the very short term -

In the very short term: lets make sure that the local Spanner IO is totally in sync with the one at Beam head so that the next release upgrade is painless.

@darshan-sj - Can you do the following -

  1. Evaluate the diff between SpannerIO and LocalSpannerIO. If the diff is only the retry settings which are added to LocalSpannerIO, go ahead and create the PR on beam so that they are both in sync.
  2. If there is additional differences identified, reach out to the Spanner internal feature team that created the diff in LocalSpannerIO and ask them to take up this work.

@damccorm
Copy link
Contributor

Thanks @manitgupta this plan SGTM

@Abacn Abacn force-pushed the removelocalspannerio branch from 253a4e9 to db2d527 Compare May 15, 2024 18:20
@Abacn Abacn marked this pull request as ready for review May 16, 2024 20:25
@Abacn
Copy link
Contributor Author

Abacn commented May 16, 2024

As dataflow template now upgraded to Beam 2.56.0, the upstream Beam now also included the newer Spanner API

https://github.com/GoogleCloudPlatform/DataflowTemplates/commits/main/v1/src/main/java/org/apache/beam/sdk/io/gcp/spanner hasn't been changed since Apr 10 (the latest commit was for Upgrade Beam version to 2.55.1) so we are save to move forward

@Abacn
Copy link
Contributor Author

Abacn commented May 16, 2024

R: @manitgupta @damccorm

@damccorm
Copy link
Contributor

Technically this looks good to me, and I agree we need to do something here. I think it is up to the Spanner folks on whether they want to move forward with this or a different direction. I'll note that my strong recommendation would be to unfork because of the tech debt forking inevitably leads to.

I will discuss this internally and get back to you. I like option 2 but I don't think it falls under the "short term" category.

@manitgupta what was the outcome of this conversation?

@github-actions
Copy link
Contributor

This pull request has been marked as stale due to 180 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time. Thank you for your contributions.

@github-actions github-actions bot added the stale label Nov 14, 2024
@Abacn Abacn mentioned this pull request Nov 15, 2024
@github-actions
Copy link
Contributor

This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this Nov 21, 2024
@Abacn Abacn reopened this Feb 13, 2025
@Abacn Abacn force-pushed the removelocalspannerio branch from db2d527 to ee0eeed Compare February 13, 2025 18:29
@stale stale bot removed stale labels Feb 13, 2025
@Abacn Abacn force-pushed the removelocalspannerio branch from ee0eeed to fafb60b Compare February 13, 2025 18:35
@Abacn
Copy link
Contributor Author

Abacn commented Feb 13, 2025

It's been a year and Beam release still get affected by conflict LocalSpannerIO package. Let's proceed with this change

@Abacn
Copy link
Contributor Author

Abacn commented Feb 13, 2025

R: @manitgupta @damccorm

@Abacn Abacn added the improvement Making existing code better label Feb 13, 2025
@Abacn Abacn changed the title Remove LocalSpannerIO Remove LocalSpannerIO and uses Beam's SpannerIO for Spanner Templates Feb 13, 2025
@damccorm
Copy link
Contributor

I agree. @manitgupta @darshan-sj could we please either take this PR or commit to fixing this problem differently in the next few weeks? This is causing continuous pain with Beam releases.

@damccorm
Copy link
Contributor

Hey @manitgupta @darshan-sj any thoughts here or objections to merging this PR?

@manitgupta
Copy link
Member

I think we definitely need to do this, and it has pending for a while as well. We need to verify all the changes made locally to LocalSpannerIO have made its way to SpannerIO in beam.

@darshan-sj Can we verify this?

@darshan-sj
Copy link
Contributor

Changes in #2144 and #2145 are pending to be moved to Apache beam. I have asked these PR authors to raise PRs in Apache Beam.

@Abacn
Copy link
Contributor Author

Abacn commented Feb 20, 2025

#2145 sync to Beam as apache/beam#34034

#2144 opened apache/beam#34043

@darshan-sj
Copy link
Contributor

Raised #2231 to move SpannerIO to different namespace

@darshan-sj
Copy link
Contributor

Short term fix is merged in #2231 . Closing this PR.

@darshan-sj darshan-sj closed this Mar 17, 2025
@Abacn Abacn deleted the removelocalspannerio branch January 6, 2026 23:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Making existing code better size/XXL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: LocalSpannerIO package namespace conflict with Beam's SpannerIO

4 participants