[SPARK-1134] Fix and document passing of arguments to IPython #294
Conversation
…only call ipython if no command line arguments were supplied
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
Well, my perspective is always that of a new easily confused user. […] Your doc change explicitly recommends I not do that, but... well, really? In the new Spark class I'm working on, which uses mainly Python, I had […]. I think what's going to happen is that users will ignore your admonishment […]. I can live with it as is (with the doc change), but it isn't a very user-friendly […].
I see, in that case, I think we can do the following:
BTW I've updated it now to be just your initial commit, but without attempting to remove `IPYTHON_OPTS`. I think this is the best solution.
Merged build triggered.
Merged build started.
Note that the new version of IPython released today has my fix in it, so doing […]
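(The elided command is presumably something along these lines — a hypothetical usage sketch based on the PR description, not bouk's actual words:)

```bash
# Hypothetical usage once an IPython release handles PYTHONSTARTUP
# correctly: run a standalone script under IPython instead of plain Python.
IPYTHON=1 bin/pyspark myscript.py
```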
Merged build finished. All automated tests passed.
All automated tests passed.
@bouk that's great, thanks, but we probably can't have it be the default for a while until more people update their IPython.
@dianacarroll I've merged this in now, using just your original commit (mateiz@747bb13). I think that's the best solution for now. Thanks for the feedback! |
This is based on @dianacarroll's previous pull request #227, and @JoshRosen's comments on #38. Since we do want to allow passing arguments to IPython, this does the following:

* It documents that IPython can't be used with standalone jobs for now. (Later versions of IPython will deal with PYTHONSTARTUP properly and enable this, see ipython/ipython#5226, but no released version has that fix.)
* If you run `pyspark` with `IPYTHON=1`, it passes your command-line arguments to it. This way you can do stuff like `IPYTHON=1 bin/pyspark notebook`.
* The old `IPYTHON_OPTS` remains, but I've removed it from the documentation. This is in case people read an old tutorial that uses it.

This is not a perfect solution and I'd also be okay with keeping things as they are today (ignoring `$@` for IPython and using `IPYTHON_OPTS`), and only doing the doc change. With this change though, when IPython fixes ipython/ipython#5226, people will immediately be able to do `IPYTHON=1 bin/pyspark myscript.py` to run a standalone script and get all the benefits of running scripts in IPython (presumably better debugging and such). Without it, there will be no way to run scripts in IPython. @JoshRosen you should probably take the final call on this.

Author: Diana Carroll <dcarroll@cloudera.com>

Closes #294 from mateiz/spark-1134 and squashes the following commits:

747bb13 [Diana Carroll] SPARK-1134 bug with ipython prevents non-interactive use with spark; only call ipython if no command line arguments were supplied

(cherry picked from commit a599e43)
Signed-off-by: Matei Zaharia <matei@databricks.com>
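For reference, here is a minimal sketch of the interpreter dispatch in `bin/pyspark` that the squashed commit describes ("only call ipython if no command line arguments were supplied"). This is a reconstruction from the commit message, not the verbatim merged diff, and the variable names assume the script's existing `IPYTHON` / `IPYTHON_OPTS` / `PYSPARK_PYTHON` conventions:

```bash
# Sketch: choose the interpreter when launching pyspark.
# With IPYTHON=1 and no arguments, start an interactive IPython shell,
# forwarding the legacy IPYTHON_OPTS. If any arguments were supplied
# (e.g. a standalone script), fall back to the plain Python interpreter,
# which handles PYTHONSTARTUP correctly.
if [[ "$IPYTHON" = "1" && $# = 0 ]]; then
  exec ipython $IPYTHON_OPTS
else
  exec "$PYSPARK_PYTHON" "$@"
fi
```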
Bug fixes for updating the RDD block's memory and disk usage information. From the code context, we can see that the memSize and diskSize here are both always equal to the size of the block; in fact, they are never zero. Thus, the logic here is wrong for recording block usage in BlockStatus, especially for blocks that are dropped from memory to make space for new input RDD blocks. I have verified that this causes the storage metrics shown on the Storage web page to be wrong and misleading. With this patch, the metrics are correct. Finally, Merry Christmas, guys :)
* Added files should be in the working directories.
* Revert unintentional changes
* Fix test
Co-authored-by: Zhiting Guo <zhiting.guo@kyligence.io>