Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address main scalastyle errors - #196 #248

Merged
merged 6 commits into from
Aug 1, 2018
Merged

Address main scalastyle errors - #196 #248

merged 6 commits into from
Aug 1, 2018

Conversation

greebie
Copy link
Contributor

@greebie greebie commented Jul 31, 2018

GitHub issue(s):

#196 , also see #212 in discussion below

What does this Pull Request do?

Provides main fixes for scalastyle where the changes do not require major code refactoring.

  • all Magic Numbers provide vals
  • cases of duplicate "strings" (usually in test) provided with vals
  • public methods have explicit Types provided
  • general linting errors (missing spaces etc.)
  • tried to resolve underscore.import errors, or ignore them where it makes sense to (e.g. main aut classes or sparksql.functions._

How should this be tested?

Mostly mvn scalastyle:check but it should also pass Travis and tests.

Additional Notes:

I have ignored any scalastyle error where I would have to refactor code to make it work.
"Avoid null" errors were the most common. I have created an issue in #212 about how to resolve these (if we indeed wish to). Looking at the code, I think it's possible that resolving these could help us with a few null Pointer errors. But I'd like to address them in a different PR, because it involves different approaches to some conditionals and may require documentation changes.

Interested parties

@ruebot

@codecov
Copy link

codecov bot commented Jul 31, 2018

Codecov Report

Merging #248 into master will increase coverage by 0.37%.
The diff coverage is 73.62%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #248      +/-   ##
==========================================
+ Coverage   70.57%   70.94%   +0.37%     
==========================================
  Files          41       41              
  Lines         982     1005      +23     
  Branches      179      180       +1     
==========================================
+ Hits          693      713      +20     
- Misses        232      235       +3     
  Partials       57       57
Impacted Files Coverage Δ
...scala/io/archivesunleashed/app/ExtractGraphX.scala 92.1% <ø> (ø) ⬆️
...io/archivesunleashed/matchbox/TupleFormatter.scala 57.14% <ø> (ø) ⬆️
...ala/io/archivesunleashed/matchbox/ComputeMD5.scala 100% <ø> (ø) ⬆️
src/main/scala/io/archivesunleashed/package.scala 84.11% <ø> (ø) ⬆️
...scala/io/archivesunleashed/app/WriteGraphXML.scala 100% <ø> (ø) ⬆️
.../io/archivesunleashed/app/PlainTextExtractor.scala 100% <ø> (ø) ⬆️
...ala/io/archivesunleashed/matchbox/RemoveHTML.scala 100% <ø> (ø) ⬆️
.../scala/io/archivesunleashed/app/ExtractGraph.scala 0% <ø> (ø) ⬆️
...c/main/scala/io/archivesunleashed/df/package.scala 90.47% <ø> (ø) ⬆️
...o/archivesunleashed/app/ExtractPopularImages.scala 100% <ø> (ø) ⬆️
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update afe9254...af65a99. Read the comment docs.

Copy link
Member

@ruebot ruebot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work.

Couple fixes need to happen, as well as some questions answered.

throw new IllegalArgumentException()
case other => throw other
case other: Any => throw other

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this blank line?

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this blank line?

@@ -45,7 +47,6 @@ object ExtractPopularImages {
.reduceByKey((image1, image2) => (image1._1, image1._2, image1._3 + image2._3))
.map(x=> (x._2._3, x._2._2))
.takeOrdered(limit)(Ordering[Int].on(x => -x._1))
res.foreach(x => println(x._1 + "\t" + x._2))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this being removed? Is it just printing to a stdout?

Copy link
Contributor Author

@greebie greebie Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - it was just printing.
Scalastyle rejects all println.
I can restore it and use a // scalastyle:off if it's necessary, however.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if @ianmilligan1 can speak to usefulness of a popular images printout here if we already have the RDD of images available (i.e. they can be mapped through and printed out in scripts if needed)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just ran it on both master and your branch, @greebie, and yes - probably that println was in there for debugging. The saveAsTextFile output is identical to the printed lines, so we were getting them twice. I think this is a good alteration.

@@ -26,7 +26,7 @@ import java.util
import scala.collection.mutable
Copy link
Member

@ruebot ruebot Jul 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to change the name of this file as well, if you're changing it to NERClassifier. Make sure you do a git mv foo bar, otherwise you'll lose the history on the file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix. @ianmilligan1 I think this should be tested in case I broke something. There are no working unit tests for this I do not think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested and works well (updated the docs which were outdated, actually).

@@ -22,7 +22,7 @@ import java.nio.charset.StandardCharsets
import org.apache.commons.codec.binary.Hex


/** Information about an image. e.g. width, height*/
/** Information about an image. e.g. width, height */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put a full stop after height.

@@ -26,6 +28,7 @@ object TweetUtils {
* @param tweet JValue/JSON object containing Twitter API data (JSON)
*/
implicit class JsonTweet(tweet: JValue) {
val userBranch = "user"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you speak to why userBranch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

userBranch refers to user as a branch of tweet json. The main error that wants to be fixed here is multiple occurrences of "user." Next commit will just use user as I agree that's less confusing.

@@ -29,10 +29,10 @@ class ExtractBoilerPipeTextTest extends FunSuite {
<footer>Copyright 2017</footer>"""
var boiler = """Copyright 2017"""

test("Collects boilerpip") {
test("Collects boilerpiep") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

@ianmilligan1
Copy link
Member

Ok, let me test NER and look into images - should be able to do so tomorrow!

@ruebot ruebot merged commit 77dbd51 into master Aug 1, 2018
@ruebot ruebot deleted the issue-196 branch August 1, 2018 13:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants