databricks_utils.aws
Description
Utility classes to interface with AWS from Databricks notebooks.
class databricks_utils.aws.S3Bucket(bucketname, aws_access_key, aws_secret_key, dbutils=None)
Bases: object
Wraps an S3 bucket and mounts it on the Databricks file system (dbfs).
Parameters:
- bucketname – name of the S3 bucket
- aws_access_key – AWS access key
- aws_secret_key – AWS secret key
- dbutils – databricks dbutils (not needed if S3Bucket.attach_dbutils has been called)
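A minimal usage sketch; the bucket name is a placeholder, and reading the credentials from environment variables is an illustrative choice, not a library requirement:

    import os
    from databricks_utils.aws import S3Bucket

    # "my-data-bucket" is a hypothetical bucket name.
    bucket = S3Bucket("my-data-bucket",
                      os.environ["AWS_ACCESS_KEY"],
                      os.environ["AWS_SECRET_KEY"],
                      dbutils=dbutils)  # dbutils is predefined in Databricks notebooks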
allow_spark(spark_context)
Update the Spark context's Hadoop configuration with the AWS access credentials so that Databricks Spark can access the S3 bucket.
Parameters: spark_context – databricks spark context
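A short sketch, assuming sc is the SparkContext that Databricks predefines in every notebook:

    # Write the bucket's AWS credentials into the SparkContext's
    # Hadoop configuration so Spark jobs can read from the bucket.
    bucket.allow_spark(sc)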
classmethod attach_dbutils(dbutils)
Attach databricks dbutils to S3Bucket. dbutils MUST be attached, either here or via the constructor's dbutils argument, before S3Bucket can be used.
Parameters: dbutils – databricks dbutils (https://docs.databricks.com/user-guide/dev-tools/dbutils.html#dbutils)
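For example, near the top of a notebook, where dbutils is the object Databricks injects automatically:

    from databricks_utils.aws import S3Bucket

    # Make dbutils available to every S3Bucket created afterwards.
    S3Bucket.attach_dbutils(dbutils)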
local(path)
Return the absolute path to the corresponding resource in dbfs.
Parameters: path – relative path to a resource in the s3 bucket.
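An illustrative call; the file path inside the bucket is hypothetical:

    # Absolute dbfs path of "data/train.csv" in the mounted bucket.
    dbfs_path = bucket.local("data/train.csv")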
ls(path='', display=None)
List the files and folders in the S3 bucket mounted in dbfs.
Parameters:
- path – path relative to the s3 bucket
- display – a Callable to render the HTML output, e.g. displayHTML
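For example, to render a listing of the bucket root in a notebook cell, using the displayHTML helper that Databricks provides:

    # List the bucket root and render the result as HTML.
    bucket.ls("", display=displayHTML)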
mount(mount_pt, dbutils=None)
Mount the S3 bucket in dbfs. The environment variables AWS_ACCESS_KEY and AWS_SECRET_KEY must be set.
Parameters:
- mount_pt – where to mount the S3 bucket in the dbfs
- dbutils – databricks dbutils module
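A sketch with a hypothetical mount point; it assumes AWS_ACCESS_KEY and AWS_SECRET_KEY are already set in the environment:

    # Mount the bucket at an illustrative location in dbfs.
    bucket.mount("/mnt/my-data-bucket")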
s3(path)
Return the path to the corresponding resource in the s3 bucket in a form that the databricks spark workers can interpret.
Parameters: path – relative path to a resource in the s3 bucket.
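For instance, handing a worker-readable path to Spark; spark is the SparkSession Databricks provides, and the CSV path is hypothetical:

    # Assumes allow_spark(sc) has been called so workers hold credentials.
    df = spark.read.csv(bucket.s3("data/train.csv"), header=True)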
umount(dbutils)
Unmount the S3 bucket from dbfs.
Parameters: dbutils – databricks dbutils
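And to clean up when the bucket is no longer needed:

    # Unmount the bucket from dbfs.
    bucket.umount(dbutils)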
Classes
S3Bucket(bucketname, aws_access_key, …[, …]) | Wraps an S3 bucket and mounts it on the Databricks file system (dbfs).