Can you please help me understand how to insert entities into multiple tables efficiently?
I have 3 tables and 3 entities accordingly. Pricebook has an array of SKUs, and each SKU has 1 price. Basically what I want is to insert multiple entities transactionally, in case of constraint I must update entities in a chain.
The problem occurred once I try to insert into DB more than 1 Pricebook in parallel, so I'm actually catching a PostgreSQL deadlock. I found a workaround is to insert them one by one putting them in the queue but I understand that it's not a great idea.
This might be a stupid question and has been already answered before, but I hope someone could give me a hint.
@Entity
@NoArgsConstructor
@AllArgsConstructor
@Table(name = "pricebook")
public class Pricebook {
@Id
@GeneratedValue(strategy=GenerationType.AUTO)
private Long id;
//....
}
@Entity
@NoArgsConstructor
@AllArgsConstructor
@Table(name = "sku")
public class Sku {
@Id
@GeneratedValue(strategy=GenerationType.AUTO)
private Long id;
//....
}
@Entity
@NoArgsConstructor
@AllArgsConstructor
@Table(name = "price")
public class Price {
@Id
@GeneratedValue(strategy=GenerationType.AUTO)
private Long id;
@JoinColumn(name = "pricebook_id", referencedColumnName = "id", unique = true)
@ManyToOne(fetch = FetchType.LAZY, cascade = CascadeType.ALL)
private Pricebook pricebook;
@JoinColumn(name = "sku_id", referencedColumnName = "id", unique = true)
@ManyToOne(fetch = FetchType.LAZY, cascade = CascadeType.ALL)
private Sku sku;
//....
}
Here is PricebookService logic of an upsert.
@NonNull
@Transactional
public Pricebook createPricebook(@NonNull CreatePricebookRequest request) {
final Instant startDate = PricebookConverter.toDate(request.getStartDate());
final Instant expirationDate = PricebookConverter.toDate(request.getExpirationDate());
if (startDate.isAfter(expirationDate)) {
throw new InvalidParametersException("The pricebook's start date later then its expiration date.");
}
final Region region = regionService.findRegionByName(request.getRegion());
final Optional<Pricebook> isPricebookFound =
pricebookRepository.findByRegionAndPricebookTypeAndStartDateAndExpirationDate(region,
request.getPricebookName(), startDate, expirationDate);
final Pricebook savedOrUpdatedPricebook;
if (isPricebookFound.isPresent()) {
final Pricebook foundPricebook = isPricebookFound.get();
savedOrUpdatedPricebook = pricebookRepository.save(
new Pricebook(foundPricebook.getPricebookId(), request.getName(), foundPricebook.getPricebookName(), foundPricebook.getRegion(), foundPricebook.getStartDate(),
foundPricebook.getExpirationDate());
logger.info("pricebook is updated successfully, pricebook={}", savedOrUpdatedPricebook);
} else {
savedOrUpdatedPricebook = pricebookRepository.save(
new Pricebook(request.getName(), request.getPricebookType(), region, startDate, expirationDate);
logger.info("pricebook is created successfully, pricebook={}", savedOrUpdatedPricebook);
}
final List<Sku> skus = skuService.createSku(savedOrUpdatedPricebook, request.getSkus());
logger.debug("skus are saved successfully, skus={}", skus);
return savedOrUpdatedPricebook;
}
Here is SkuService logic of an upsert. skuToCreateOrUpdate
basically is just a method wich wraps logic if it's found or new and return a new object.
@NonNull
public List<Sku> createSku(@NonNull Pricebook pricebook, @NonNull List<CreateSkuRequest> skus) {
return skus.stream().map(sku -> {
final Optional<Sku> foundSku = skuRepository.findByCode(sku.getCode());
final Sku savedOrUpdatedSku = skuRepository.save(skuToCreateOrUpdate(sku, foundSku.map(Sku::getSkuId).orElse(null)));
final List<Price> prices = priceService.createPrices(pricebook, savedOrUpdatedSku, sku.getPrice());
logger.debug("prices are saved successfully, prices={}", prices);
return savedOrUpdatedSku;
}).collect(toList());
}
Here is PriceService logic of an upsert.
@NonNull
public List<Price> createPrices(@NonNull Pricebook pricebook, @NonNull Sku sku, @NonNull CreatePriceRequest price) {
final Optional<Price> foundPrice = priceRepository.findByPricebookAndSku(pricebook, sku);
final Price savedOrUpdatedPrice;
if (foundPrice.isPresent()) {
final Price priceToUpdate = foundPrice.get();
savedOrUpdatedPrice = priceRepository.save(
new Price(priceToUpdate.getPriceId(),
pricebook,
sku);
logger.info("price is updated successfully, price={}", savedOrUpdatedPrice);
} else {
savedOrUpdatedPrice = priceRepository.save(
new Price(pricebook, sku);
logger.info("price is created successfully, price={}", savedOrUpdatedPrice);
}
return Collections.singletonList(savedOrUpdatedPrice);
}
I'm using a JpaRepository all over the places. Like so..
@Repository
public interface PricebookRepository extends JpaRepository<Pricebook, Long> {}
@Repository
public interface SkuRepository extends JpaRepository<Sku, Long> {}
@Repository
public interface PriceRepository extends JpaRepository<Price, Long> {}
I believe you might be facing this issue, which is pretty likely especially if both transactions attempt to insert the same SKUs.
If that's the case, I can think of two ways to mitigate it:
Partial solution: try sorting the SKUs in List<CreateSkuRequest> skus
by sku.code
and (if that's not enough) using saveAndFlush()
to store them, ensuring the order of insertion. This should eliminate the circular wait, meaning that now, at least one of the transactions should succeed (the other will probably get a unique constraint violation)
Full solution: if you prefer both transactions to succeed, you would have to acquire a table-level lock for the SKU
table. You should be able to do this using a custom update query:
@Query(value = "LOCK TABLE SKU IN EXCLUSIVE MODE", nativeQuery = true)
@Modifying
void lockTable();
then, just call the method as the first operation inside createSku
. Note that this might turn out to be only marginally more efficient than putting the transactions in a queue, so if I were you, I'd probably still just go with that approach.
EDIT I also didn't quite understand the exact scenario which gives you the consistent result of two transactions clashing, is this a batch insert type of thing that you're trying to parallelize? If you're really bent on running the transactions in parallel, perhaps you could group your input set so that the SKUs don't overlap. Or, deduplicate and insert the Sku
s upfront. As I said, I don't know what the use case is, so not sure if that makes sense.
Thank you, I'll try your solution and get back with result
Seems like you're right, locking is more efficient rather than do that in a queue. Thank you, that helped me