@kou
Member

@kou kou commented Sep 22, 2022

No description provided.

@kou
Member Author

kou commented Sep 22, 2022

@github-actions crossbow submit java-jars

@github-actions

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@github-actions

This comment was marked as outdated.

@kou
Member Author

kou commented Sep 22, 2022

@github-actions crossbow submit java-jars

@github-actions

This comment was marked as outdated.

@kou
Member Author

kou commented Sep 23, 2022

@github-actions crossbow submit java-jars

@github-actions

This comment was marked as outdated.

@kou
Member Author

kou commented Sep 23, 2022

@github-actions crossbow submit java-jars

@github-actions

This comment was marked as outdated.

@kou
Member Author

kou commented Sep 23, 2022

@github-actions crossbow submit java-jars

@github-actions

This comment was marked as outdated.

@kou
Member Author

kou commented Sep 23, 2022

@github-actions crossbow submit java-jars

@github-actions

This comment was marked as outdated.

@kou
Member Author

kou commented Sep 23, 2022

@github-actions crossbow submit java-jars

@github-actions

This comment was marked as outdated.

@kou
Member Author

kou commented Sep 23, 2022

@github-actions crossbow submit java-jars

@github-actions

This comment was marked as outdated.

@kou
Member Author

kou commented Sep 23, 2022

@github-actions crossbow submit java-jars

@github-actions

This comment was marked as outdated.

@kou
Member Author

kou commented Sep 24, 2022

@github-actions crossbow submit java-jars

@github-actions

This comment was marked as outdated.

@kou
Member Author

kou commented Sep 24, 2022

@github-actions crossbow submit java-jars

@github-actions

This comment was marked as outdated.

@kou
Member Author

kou commented Sep 26, 2022

Ah, we don't need the .lib files. I'll remove them.

@kou
Member Author

kou commented Sep 26, 2022

@github-actions crossbow submit java-jars

@github-actions

Revision: 092d458

Submitted crossbow builds: ursacomputing/crossbow @ actions-ae8c93c847

Task Status
java-jars Github Actions

@davisusanibar
Contributor

Tested on Windows 10 Home:

1.- Download the new Dataset / C Data jars locally from https://github.com/ursacomputing/crossbow/releases/tag/actions-9be4b55dea-github-java-jars

2.- Check the dependencies of the newly built DLLs:

# Dataset DLL
$ cygcheck.exe 'arrow_dataset_jni.dll'
  C:\Windows\system32\WINHTTP.dll
    C:\Windows\system32\ntdll.dll
    C:\Windows\system32\KERNELBASE.dll
  C:\Windows\system32\bcrypt.dll
  C:\Windows\system32\WININET.dll
    C:\Windows\system32\msvcrt.dll
  C:\Windows\system32\USERENV.dll
    C:\Windows\system32\RPCRT4.dll
  C:\Windows\system32\VERSION.dll
    C:\Windows\system32\KERNEL32.dll
  C:\Windows\system32\WS2_32.dll
  C:\Windows\system32\SHELL32.dll
    C:\Windows\system32\msvcp_win.dll
    C:\Windows\system32\USER32.dll
      C:\Windows\system32\win32u.dll
      C:\Windows\system32\GDI32.dll
  C:\Windows\system32\ole32.dll
    C:\Windows\system32\combase.dll
  C:\Windows\system32\ADVAPI32.dll
    C:\Windows\system32\SECHOST.dll
  C:\Windows\system32\MSVCP140.dll
    C:\Windows\system32\VCRUNTIME140.dll
    C:\Windows\system32\VCRUNTIME140_1.dll

# C Data Interface DLL
$ cygcheck.exe 'arrow_cdata_jni.dll'
  C:\Windows\system32\MSVCP140.dll
    C:\Windows\system32\VCRUNTIME140.dll
      C:\Windows\system32\KERNEL32.dll
        C:\Windows\system32\ntdll.dll
        C:\Windows\system32\KERNELBASE.dll
    C:\Windows\system32\VCRUNTIME140_1.dll

If you see errors, try installing the latest supported Visual C++ Redistributable: https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist?view=msvc-170
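Before running cygcheck, it can also help to confirm that the jar really bundles the DLL. The following is a minimal sketch using only the JDK's `java.util.jar` API. The class name `JarNativeLibs`, the helper `findDlls`, and the default jar file name are hypothetical and not part of Arrow. It makes no assumption about where inside the jar the DLL is stored and simply lists every entry ending in `.dll`.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

public class JarNativeLibs {
    // Returns the sorted names of all entries in the jar that end in ".dll".
    public static List<String> findDlls(Path jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.stream()
                .map(JarEntry::getName)
                .filter(name -> name.toLowerCase().endsWith(".dll"))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default file name; pass the real jar path as the first argument.
        Path jar = Paths.get(args.length > 0 ? args[0] : "arrow-dataset-10.0.0-SNAPSHOT.jar");
        findDlls(jar).forEach(System.out::println);
    }
}
```

Running it against the Dataset jar should list an entry for arrow_dataset_jni.dll, and against the C Data jar an entry for arrow_cdata_jni.dll, if the DLLs were packaged as expected.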

3.- Install the new Dataset / C Data jars into the local Maven repository:

# install dataset manually
mvn install:install-file -Dfile="C:\Users\dsusanibar\IdeaProjects\win-cookbooks\src\main\resources\files\arrow-dataset-10.0.0-SNAPSHOT.pom" -DgroupId="org.apache.arrow" -DartifactId="arrow-dataset" -Dversion="10.0.0-SNAPSHOT" -Dpackaging="pom"
mvn install:install-file -Dfile="C:\Users\dsusanibar\IdeaProjects\win-cookbooks\src\main\resources\files\arrow-dataset-10.0.0-SNAPSHOT.jar" -DgroupId="org.apache.arrow" -DartifactId="arrow-dataset" -Dversion="10.0.0-SNAPSHOT" -Dpackaging="jar"
# install c data interface manually
mvn install:install-file -Dfile="C:\Users\dsusanibar\IdeaProjects\win-cookbooks\src\main\resources\files\arrow-c-data-10.0.0-SNAPSHOT.pom" -DgroupId="org.apache.arrow" -DartifactId="arrow-c-data" -Dversion="10.0.0-SNAPSHOT" -Dpackaging="pom"
mvn install:install-file -Dfile="C:\Users\dsusanibar\IdeaProjects\win-cookbooks\src\main\resources\files\arrow-c-data-10.0.0-SNAPSHOT.jar" -DgroupId="org.apache.arrow" -DartifactId="arrow-c-data" -Dversion="10.0.0-SNAPSHOT" -Dpackaging="jar"

4.- Add the new Dataset / C Data Interface dependencies to your project (Maven/Gradle)

5.- Create a Dataset with the new Dataset jar, which contains arrow_dataset_jni.dll, and read RecordBatches with the new C Data Interface jar, which contains arrow_cdata_jni.dll:

import org.apache.arrow.dataset.file.FileFormat;
import org.apache.arrow.dataset.file.FileSystemDatasetFactory;
import org.apache.arrow.dataset.jni.NativeMemoryPool;
import org.apache.arrow.dataset.scanner.ScanOptions;
import org.apache.arrow.dataset.scanner.Scanner;
import org.apache.arrow.dataset.source.Dataset;
import org.apache.arrow.dataset.source.DatasetFactory;
import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.ipc.ArrowReader;

import java.io.IOException;

public class Recipe {
    public static void main(String[] args) {
        // File at: https://github.com/apache/arrow-cookbook/blob/main/java/thirdpartydeps/parquetfiles/data1.parquet
        String uri = "file:///C:\\Users\\dsusanibar\\IdeaProjects\\win-cookbooks\\src\\main\\resources\\files\\data1.parquet";
        ScanOptions options = new ScanOptions(/*batchSize*/ 5);
        try (
            BufferAllocator allocator = new RootAllocator();
            DatasetFactory datasetFactory = new FileSystemDatasetFactory(allocator, NativeMemoryPool.getDefault(), FileFormat.PARQUET, uri);
            Dataset dataset = datasetFactory.finish();
            Scanner scanner = dataset.newScan(options)
        ) {
            scanner.scan().forEach(scanTask -> {
                // Declared outside the loop so the batch number increments within each scan task.
                int count = 1;
                try (ArrowReader reader = scanTask.execute()) {
                    while (reader.loadNextBatch()) {
                        // The reader owns this root and reuses it for every batch, so it is not closed here.
                        VectorSchemaRoot root = reader.getVectorSchemaRoot();
                        System.out.println("Number of rows per batch[" + count++ + "]: " + root.getRowCount());
                        System.out.println(root.contentToTSVString());
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Result:
Number of rows per batch[1]: 3
id	name
1	David
2	Gladis
3	Juan
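The URI in the recipe is written by hand with escaped backslashes. As an alternative, here is a minimal sketch that derives the URI from a path with the JDK's `Path.toUri()`, which produces forward slashes (for example `file:///C:/...` on Windows). The class name `FileUriExample` and the helper `toFileUri` are hypothetical, and it is an assumption that the dataset factory accepts this form just as it accepted the hand-written one above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class FileUriExample {
    // Converts a (possibly relative) path into an absolute file URI string.
    public static String toFileUri(Path path) {
        return path.toAbsolutePath().toUri().toString();
    }

    public static void main(String[] args) {
        // Hypothetical default; pass the real path to data1.parquet as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "data1.parquet");
        System.out.println(toFileUri(path));
    }
}
```

The resulting string could then be passed as the `uri` argument of `FileSystemDatasetFactory` in place of the literal.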

Thanks a lot @kou

@kou
Member Author

kou commented Sep 28, 2022

That's good to know. :-)
I'll merge this.

@kou kou merged commit 35bfeb4 into apache:master Sep 28, 2022
@kou kou deleted the java-jni-windows branch September 28, 2022 03:14
@ursabot

ursabot commented Sep 28, 2022

Benchmark runs are scheduled for baseline = f3af96a and contender = 35bfeb4. 35bfeb4 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.14% ⬆️0.0%] test-mac-arm
[Failed ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.04% ⬆️0.04%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 35bfeb41 ec2-t3-xlarge-us-east-2
[Failed] 35bfeb41 test-mac-arm
[Failed] 35bfeb41 ursa-i9-9960x
[Finished] 35bfeb41 ursa-thinkcentre-m75q
[Finished] f3af96a2 ec2-t3-xlarge-us-east-2
[Failed] f3af96a2 test-mac-arm
[Failed] f3af96a2 ursa-i9-9960x
[Finished] f3af96a2 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
