Use singleton per-region for S3Client (facebookresearch#384)
Summary:
Pull Request resolved: facebookresearch#384

# What
* Use a singleton for each region when constructing our S3Client instead of a _single_ singleton
* This is kind of a follow-up to D36727729 (facebookresearch@47182c6)
# Why
* Follow-up to S281873 - S3Client singleton breaks multi-region support in PCF IO
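
To illustrate the What/Why above, here is a minimal sketch of how a single function-local static ignores the region on every call after the first, and how a per-region map avoids that. `ClientOption`, `FakeClient`, `getSingleInstance`, and `getPerRegionInstance` are hypothetical stand-ins rather than fbpcf APIs, and the sketch uses `std::map` instead of the folly containers in the actual change. It makes no attempt at thread safety.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for fbpcf::aws::S3ClientOption (assumption).
struct ClientOption {
  std::optional<std::string> region;
};

// Hypothetical stand-in for S3Client that only remembers its region.
struct FakeClient {
  explicit FakeClient(const ClientOption& option)
      : region(option.region.value_or("")) {}
  std::string region;
};

// Old shape: the static is initialised by the first call only, so every
// later caller gets that first region's client regardless of its option.
FakeClient& getSingleInstance(const ClientOption& option) {
  static FakeClient client(option);
  return client;
}

// New shape: one instance per region key, with "" as the default region.
// std::map keeps references to existing elements valid across inserts.
FakeClient& getPerRegionInstance(const ClientOption& option) {
  static std::map<std::string, FakeClient> clients;
  const std::string region = option.region.value_or("");
  auto it = clients.find(region);
  if (it == clients.end()) {
    it = clients.emplace(region, FakeClient{option}).first;
  }
  return it->second;
}
```

With the single static, a second caller asking for a different region silently receives the first caller's client; with the map, each region gets its own instance and repeated calls for one region return the same object.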

NOTE: PCF should be owning this code in the long-term since it's out of PSI's scope, but since we owned the initial SEV, I took ownership of this follow-up.

Differential Revision: D39174441

fbshipit-source-id: 73dd6d6c88e216da3f99573689ef4c4eaa7d16ed
Logan Gore authored and facebook-github-bot committed Sep 1, 2022
1 parent 6fa554d commit 7854242
Showing 1 changed file with 36 additions and 2 deletions.
38 changes: 36 additions & 2 deletions fbpcf/io/cloud_util/S3Client.cpp
@@ -7,12 +7,46 @@

 #include "fbpcf/io/cloud_util/S3Client.h"

+#include <string>
+
 #include <aws/core/Aws.h>
 #include <aws/s3/S3Client.h>

+#include <folly/container/F14Map.h>
+#include <folly/Synchronized.h>
+
 namespace fbpcf::cloudio {
 S3Client& S3Client::getInstance(const fbpcf::aws::S3ClientOption& option) {
-  static S3Client s3Client(option);
-  return s3Client;
+  /* Due to previous problems, we create a Singleton instance of the S3Client,
+   * but there's a catch: we need a distinct S3Client for each region, or we
+   * run into other issues. For that reason, we store this map from string to
+   * S3Client with the assumption that the keys are region names. Since region
+   * is optional, we also allow for a default empty string region.
+   * ***************************** NOT THREAD SAFE ****************************
+   * NOTE: Significant refactoring is required to make this thread safe
+   * Downstream usage wants a mutable reference, but a folly::Synchronized
+   * RWLock will return a const ref to a reader, meaning it's hard to refactor.
+   * Simply trying to use folly::Synchronized around the map isn't sufficient,
+   * because we'll leak a reference to an object in the map which is unsafe.
+   * ***************************** NOT THREAD SAFE ****************************
+   */
+  static folly::Synchronized<folly::F14FastMap<std::string, S3Client>> m;
+
+  std::string defaultStr{};
+  auto region = option.region.value_or(defaultStr);
+
+  m.withWLock([&](auto& clientMap) {
+    if (clientMap.find(region) == clientMap.end()) {
+      clientMap.emplace(region, S3Client{option});
+    }
+  });
+
+  /* You may see this and think, "Hey, the NOT THREAD SAFE warning above is
+   * outdated, it looks like we fixed it!", but you're wrong. This still does
+   * not fully solve the problem. Because the downstream consumer takes a
+   * mutable reference, there's no guarantee that this is thread safe. It's
+   * better than nothing, but you still shouldn't fully trust this code.
+   */
+  return m.wlock()->at(region);
 }
 } // namespace fbpcf::cloudio
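
The comments in the change point out that a reference to a map element escapes the lock. One common mitigation, which is not what this commit does, is to store `std::shared_ptr` values and hand out a copy of the pointer while the lock is held, so the caller's handle stays valid even if the map later rehashes (as far as I know, `F14FastMap` does not guarantee reference stability for its elements). The sketch below uses only the standard library; `RegionOption`, `RegionClient`, and `getSharedInstance` are hypothetical names. It protects only the lookup and insertion, so concurrent use of one returned client would still need the client itself to be thread safe.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for fbpcf::aws::S3ClientOption (assumption).
struct RegionOption {
  std::optional<std::string> region;
};

// Hypothetical stand-in for S3Client that only remembers its region.
struct RegionClient {
  explicit RegionClient(const RegionOption& option)
      : region(option.region.value_or("")) {}
  std::string region;
};

// Returns a shared handle to the client for the option's region, creating it
// on first use. The mutex guards the map; the returned pointer is copied
// before the lock is released, so no reference into the map escapes.
std::shared_ptr<RegionClient> getSharedInstance(const RegionOption& option) {
  static std::mutex mutex;
  static std::unordered_map<std::string, std::shared_ptr<RegionClient>> clients;

  const std::string region = option.region.value_or("");
  std::lock_guard<std::mutex> guard(mutex);
  auto& slot = clients[region];  // inserts an empty pointer on first use
  if (!slot) {
    slot = std::make_shared<RegionClient>(option);
  }
  return slot;
}
```

The trade-off is a change of return type from a mutable reference to a shared pointer, which matches the "significant refactoring" the comment warns about for downstream callers.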
