Closed Bug 1472810 Opened 7 years ago Closed 7 years ago

Investigate use of hadoop s3a:// filesystem rather than s3://

Categories

(Data Platform and Tools :: General, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: klukas, Unassigned)

Details

(Whiteboard: [DataPlatform])

The hadoop-aws library contains multiple implementations of an S3 filesystem: s3, s3n, and s3a [0]. All of our S3 URLs in telemetry-batch-view and other projects appear to use the s3:// prefix, which the hadoop-aws docs indicate is deprecated. We should investigate moving all S3 URLs to s3a://, which is the actively maintained filesystem and likely offers some performance improvement.

[0] https://hadoop.apache.org/docs/r2.8.0/hadoop-aws/tools/hadoop-aws/index.html#S3
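
For illustration, a minimal sketch of the URL scheme rewrite proposed above, assuming a plain prefix substitution is all that is needed. The helper name to_s3a and the bucket and key in the usage line are hypothetical and not taken from telemetry-batch-view. The sketch only rewrites strings; it does not check whether s3a is configured or appropriate on a given cluster.

```python
def to_s3a(url: str) -> str:
    """Rewrite an s3:// or s3n:// URL to use the s3a:// scheme.

    Hypothetical helper for illustration. URLs with any other
    scheme (including s3a:// itself) are returned unchanged.
    """
    scheme, sep, rest = url.partition("://")
    # Only touch the older Hadoop S3 schemes; leave hdfs://, file://, etc. alone.
    if sep and scheme.lower() in ("s3", "s3n"):
        return "s3a://" + rest
    return url


if __name__ == "__main__":
    # Hypothetical bucket and key, used only to show the transformation.
    print(to_s3a("s3://example-bucket/some/prefix/part-00000.parquet"))
```

Keeping the rewrite in one small function would make it easy to apply at the point where paths are read from configuration, and just as easy to back out if the investigation concludes the change is not warranted.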
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INVALID
Component: Datasets: General → General