##PowerShell helper functions for Azure HDInsight
This is a collection of helper functions I've created for working with HDInsight. It currently provides functions for uploading (add), downloading (get), listing (find), and deleting (remove) files in the primary storage account of an HDInsight cluster.
The primary storage for an HDInsight cluster is really just a container in an Azure Blob storage account, so these helpers are thin wrappers around the Azure PowerShell cmdlets for blob operations. The difference is that they query your HDInsight cluster to figure out which storage to use, so you only have to know the cluster name for these operations.
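As a rough sketch of that lookup (not necessarily how the module itself is written), the function below resolves a cluster's default storage and builds a storage context from it. `Get-ClusterStorageInfo` is a hypothetical name used only for illustration, and the `DefaultStorageAccount` property names are assumed from the classic (Service Management) Azure PowerShell module.

```powershell
# Hypothetical helper, not part of hdinsight-tools: resolves the default
# storage for a cluster and returns what the blob cmdlets need.
function Get-ClusterStorageInfo {
    param([Parameter(Mandatory = $true)][string]$ClusterName)

    # Assumption: the classic Azure module exposes the default storage
    # account on the cluster object under these property names.
    $cluster = Get-AzureHDInsightCluster -Name $ClusterName
    $default = $cluster.DefaultStorageAccount

    # The account name may come back fully qualified; keep only the first label.
    $account = $default.StorageAccountName.Split(".")[0]

    [pscustomobject]@{
        Container = $default.StorageContainerName
        Context   = New-AzureStorageContext -StorageAccountName $account -StorageAccountKey $default.StorageAccountKey
    }
}

# Usage (placeholder cluster name and paths):
# $storage = Get-ClusterStorageInfo -ClusterName "mycluster"
# Set-AzureStorageBlobContent -File "C:\data\input.txt" -Blob "example/data/input.txt" `
#     -Container $storage.Container -Context $storage.Context
```

Written out by hand in every script, the lookup plus the upload is roughly the "5+ lines" the helpers are meant to save.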
As I encounter other common tasks that can benefit from helper functions, I'll add them to this project.
##Contributing
If you have things you'd like to see in this collection, add an issue using the issues link. If you have something you'd like to contribute, fork this repository, write the code, test it, then send a pull request.
##Why do this?
Because uploading and downloading files are things you do every day, it makes sense to encapsulate them in a nice function rather than writing 5+ lines of code every time you write a new script to upload something, run a job, and download the results.
##Installing
- Download the hdinsight-tools.psm1.
- From PowerShell, use import-module hdinsight-tools.psm1 to load the module.
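For example, assuming the file was saved to the current directory (an assumption; adjust the path to wherever you downloaded it), loading the module and checking what it exports might look like this:

```powershell
# Load the module from the folder where you saved it.
Import-Module .\hdinsight-tools.psm1

# List the functions the module exports. A module imported from a .psm1
# path is named after the file, hence "hdinsight-tools".
Get-Command -Module hdinsight-tools
```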
##Using
You must have an active Azure subscription to use this, and that subscription must be associated with your Azure PowerShell session (using Add-AzureAccount or Import-AzurePublishSettingsFile). The HDInsight-Tools will bark at you if you don't have these.
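A minimal sketch of getting a session into that state with the classic Azure PowerShell module follows. The publish settings path and subscription name are placeholders, not values from this project.

```powershell
# Sign in interactively with your Azure account...
Add-AzureAccount

# ...or import a publish settings file instead (placeholder path).
# Import-AzurePublishSettingsFile "C:\path\to\your.publishsettings"

# If you have several subscriptions, pick one (placeholder name).
# Select-AzureSubscription -SubscriptionName "My Subscription"

# Confirm which subscription the session is currently using.
Get-AzureSubscription -Current
```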
###Working with files
- Add-HDInsightFile - Adds a single file from the local file system (or a network share) to the primary storage container for your HDInsight cluster.
- Remove-HDInsightFile - Removes a single file from the primary storage container for your HDInsight cluster.
- Get-HDInsightFile - Gets a single file from the primary storage container for your HDInsight cluster.
- Find-HDInsightFile - Lists file(s) in the primary storage container for your HDInsight cluster.
- Get-HDInsightStorage - Lists the storage account(s) associated with your HDInsight cluster. Mostly useful if you are going to use some other tool, like AzCopy.
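For orientation, these are the classic Azure blob cmdlets that the list, get, and remove helpers most plausibly wrap. The mapping is my assumption rather than something read from the module's source, and the account name, key, container, and blob paths are all placeholders.

```powershell
# Placeholder storage details; in practice these come from the cluster lookup.
$context   = New-AzureStorageContext -StorageAccountName "mystorageaccount" -StorageAccountKey "<storage-key>"
$container = "mycontainer"

# List blobs under a prefix (roughly what Find-HDInsightFile covers).
Get-AzureStorageBlob -Container $container -Prefix "example/data/" -Context $context

# Download one blob to a local folder (roughly Get-HDInsightFile).
Get-AzureStorageBlobContent -Blob "example/data/output.txt" -Container $container `
    -Destination "C:\data\" -Context $context

# Delete one blob (roughly Remove-HDInsightFile).
Remove-AzureStorageBlob -Blob "example/data/output.txt" -Container $container -Context $context
```

The helpers' own parameter names are not listed here; once the module is loaded, `Get-Command Add-HDInsightFile -Syntax` will show them.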
NOTE: These functions currently only work on one file at a time, mainly because they depend on a context generated by New-AzureStorageContext, which doesn't serialize/deserialize when using workflows, and workflows are what you have to use for foreach -parallel. If you have a solution that allows parallel file operations using a storage context, please let me know. Otherwise, use Get-HDInsightStorage to find your storage containers, then use something like AzCopy to perform bulk copy operations.
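If parallelism isn't essential, a plain sequential loop sidesteps the workflow problem, because the storage context never leaves the current session. The sketch below is one way to do that; `Send-FolderToBlob` is a hypothetical name and not part of this module, and it assumes you already have a container name and a context from New-AzureStorageContext.

```powershell
# Hypothetical helper: uploads every file under $LocalFolder one at a time,
# reusing a single storage context (sequential, not parallel).
function Send-FolderToBlob {
    param(
        [Parameter(Mandatory = $true)][string]$LocalFolder,
        [Parameter(Mandatory = $true)][string]$Container,
        [Parameter(Mandatory = $true)]$Context,
        [string]$BlobPrefix = ""
    )

    $root = (Resolve-Path $LocalFolder).Path
    Get-ChildItem -Path $root -Recurse -File | ForEach-Object {
        # Turn the path relative to the root into a blob name with forward slashes.
        $relative = $_.FullName.Substring($root.Length).TrimStart('\', '/') -replace '\\', '/'
        Set-AzureStorageBlobContent -File $_.FullName -Blob ($BlobPrefix + $relative) `
            -Container $Container -Context $Context
    }
}

# Usage (placeholder values):
# Send-FolderToBlob -LocalFolder "C:\data" -Container $container -Context $context -BlobPrefix "example/data/"
```

This will be slower than AzCopy for large batches, so it is best suited to a handful of files.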