Closed
Bug 1472810
Opened 7 years ago
Closed 7 years ago
Investigate use of hadoop s3a:// filesystem rather than s3://
Categories
(Data Platform and Tools :: General, enhancement)
Tracking
(Not tracked)
RESOLVED
INVALID
People
(Reporter: klukas, Unassigned)
Details
(Whiteboard: [DataPlatform])
The hadoop-aws library contains multiple implementations of an S3 filesystem: s3, s3n, and s3a [0]. It appears that all our S3 URLs in telemetry-batch-view and other projects use the s3:// prefix, which the hadoop-aws docs indicate is deprecated.
We should investigate moving all S3 URLs to s3a://, which is the actively maintained filesystem and likely offers some performance improvement.
[0] https://hadoop.apache.org/docs/r2.8.0/hadoop-aws/tools/hadoop-aws/index.html#S3
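For reference while investigating, the scheme change proposed above could be sketched roughly as follows. This is only an illustration: `to_s3a`, `LEGACY_SCHEMES`, and the bucket names are hypothetical and do not come from telemetry-batch-view or hadoop-aws, and whether rewriting is appropriate at all depends on which filesystem implementation the runtime maps each scheme to.

```python
# Hypothetical helper, not part of telemetry-batch-view or hadoop-aws.
# Schemes that the hadoop-aws docs list alongside s3a.
LEGACY_SCHEMES = {"s3", "s3n"}


def to_s3a(url: str) -> str:
    """Swap a legacy S3 scheme for s3a; return any other URL unchanged."""
    scheme, sep, rest = url.partition("://")
    # Only rewrite when a scheme separator is present and the scheme is legacy.
    if sep and scheme.lower() in LEGACY_SCHEMES:
        return "s3a://" + rest
    return url


if __name__ == "__main__":
    # Example bucket and key names are made up.
    for example in ("s3://example-bucket/a/b", "s3n://example-bucket/a/b", "hdfs://nn/a/b"):
        print(example, "->", to_s3a(example))
```

The helper only touches the scheme prefix, so bucket names and keys pass through byte for byte.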
Reporter
Updated • 7 years ago
Whiteboard: [DataPlatform]
Comment 1 • 7 years ago
On EMR, s3:// is the right scheme to use: it is served by EMRFS rather than by the deprecated hadoop-aws s3 filesystem, as per https://aws.amazon.com/premiumsupport/knowledge-center/emr-file-system-s3/
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INVALID
Assignee
Updated • 3 years ago
Component: Datasets: General → General