The BDS Bucket is an AWS S3 bucket where Brightspace Data Streams puts event objects that could not be written to the normal Kinesis data stream. These are not “erroneous” events: they have passed the same validation procedures as events posted to the stream; they simply could not be delivered to it. Typical reasons include:
The Kinesis stream was disabled.
BDS was denied write access to the stream (possibly because the stream was erroneously reconfigured).
The stream was otherwise not accepting input (for example, it was throttling writes).
The Bucket contains exactly the same kind of event objects that are written to the Kinesis stream. It only contains these objects; the Bucket does not contain any sort of error messages to indicate why objects couldn’t be written to the Kinesis stream.
Since the Bucket is a last resort for storing events, the only AWS actions allowed on it are S3.listObjectsV2 (which turns into ListBucket), S3.getObject (which turns into GetObject), and S3.deleteObject (which turns into DeleteObject). When invoking these actions, the RequestPayer property should be set to “requester”.
A lambda for processing events from the Bucket would typically use S3.listObjectsV2 to list the contents of the Bucket, then use S3.getObject calls to obtain the event objects, and finally S3.deleteObject to delete any objects which have been processed successfully.
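The list/get/delete workflow just described might be sketched as a small drain loop like the one below. This is a minimal sketch, not a production implementation: it assumes boto3 (the `s3` argument is any object with a boto3-compatible S3 client interface, which also makes the function testable offline), and the bucket name and `handle_event` callback are placeholders. Note that every call sets `RequestPayer="requester"`, as required for the Bucket.

```python
import json

def drain_bucket(s3, bucket, handle_event):
    """List, fetch, process, and delete event objects from the BDS Bucket
    using only the three allowed actions. `handle_event` processes one
    decoded event and returns True on success."""
    token = None
    while True:
        kwargs = {"Bucket": bucket, "RequestPayer": "requester"}
        if token:
            kwargs["ContinuationToken"] = token
        page = s3.list_objects_v2(**kwargs)          # ListBucket
        for obj in page.get("Contents", []):
            key = obj["Key"]
            body = s3.get_object(                    # GetObject
                Bucket=bucket, Key=key, RequestPayer="requester"
            )["Body"].read()
            if handle_event(json.loads(body)):
                s3.delete_object(                    # DeleteObject
                    Bucket=bucket, Key=key, RequestPayer="requester"
                )
        if not page.get("IsTruncated"):
            break
        token = page.get("NextContinuationToken")

# Against real AWS you would pass an actual client, e.g.:
#   import boto3
#   drain_bucket(boto3.client("s3"), "my-bds-bucket", my_handler)
```

Objects that fail processing are deliberately left in place, so a later run (or the seven-day expiry) deals with them.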
AWS automatically deletes event objects that have been in the Bucket for more than seven days.
Configuring the Bucket¶
To create the Bucket, follow the AWS documentation for using the management console to create an S3 bucket. You must give BDS permission to write to the Bucket; when you enroll with BDS, you will be told which AWS account should be given write permission. Follow the instructions in Setting Bucket Permissions to give BDS write permission.
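The write permission is typically granted with a bucket policy along these lines. This is only a hypothetical illustration: the account ID (`111111111111`) and bucket name (`my-bds-bucket`) are placeholders, the actual account to allow is supplied when you enroll, and Setting Bucket Permissions remains the authoritative reference.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowBdsWrite",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111111111111:root" },
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-bds-bucket/*"
    }
  ]
}
```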
Processing Events in the Bucket¶
To deal with events in the Bucket, you should create a lambda that reads the events and does something useful with them. There are several ways you might trigger this lambda:
The lambda might be triggered when a notification is received on the BDS SNS topic. The downside of this is that the notification will likely be sent long after events start being written to the Bucket. This means that Bucket events won’t be dealt with in a timely fashion.
The lambda might be triggered by an SQS queue subscribed to the SNS topic. Going through an SQS queue provides more versatility in responding to the situation but has the same type of delay as the previous approach.
The lambda might wake up at regular intervals and use S3.listObjectsV2 to see if the Bucket contains any events. If you set the interval to, say, every 10 seconds, you will get reasonably immediate processing of Bucket events. However, the cost of these repeated invocations would probably be unjustifiable, since the lambda would almost always find the Bucket empty. It would be less expensive for the wake-up interval to be much longer—say every two hours—but that would mean less timely processing.
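For the scheduled-polling option above, the Lambda entry point might look like the following sketch. It assumes boto3 inside AWS Lambda and a hypothetical `BDS_BUCKET` environment variable; the `s3` parameter is injectable so the handler can be exercised without AWS. The `MaxKeys=1` listing keeps the common "Bucket is empty" case as cheap as possible.

```python
import os

def lambda_handler(event, context, s3=None):
    """Scheduled entry point: check whether the BDS Bucket holds any events
    before doing any further work."""
    if s3 is None:
        import boto3  # only needed when running inside AWS Lambda
        s3 = boto3.client("s3")
    bucket = os.environ.get("BDS_BUCKET", "my-bds-bucket")  # hypothetical name
    # MaxKeys=1 is enough to learn whether the Bucket is empty.
    page = s3.list_objects_v2(Bucket=bucket, MaxKeys=1, RequestPayer="requester")
    if page.get("KeyCount", 0) == 0:
        return {"bucket_empty": True}
    # Non-empty: run the full list/get/delete processing loop here.
    return {"bucket_empty": False}
```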
The general approach to processing Bucket events is to use S3.listObjectsV2 to see what events are in the Bucket, then S3.getObject to retrieve batches of events, and finally S3.deleteObject to delete events once they’ve been dealt with.
One way of dealing with events would be to write them to the main Kinesis stream where the usual lambda would process them. However, this may have drawbacks:
Events get written to the Bucket because they can’t be written to the Kinesis stream. Trying to copy events from the Bucket to the stream might fail for the same reason.
Events in the Bucket may be substantially older than events in the Kinesis stream. It may not be appropriate to process Bucket events in the same way as the “fresh” events in the stream.
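If you do choose to replay Bucket events into the main stream, the copy step might look like the sketch below (boto3 assumed; the stream name and partition key are placeholders you would choose). Because of the first drawback above, the sketch reports failure rather than retrying blindly, so the event stays in the Bucket for a later attempt.

```python
def replay_event(kinesis, stream, raw_event, partition_key):
    """Attempt to copy one Bucket event back into the main Kinesis stream.
    Returns True on success, False if the stream still refuses the write
    (the same condition that diverted the event to the Bucket originally)."""
    try:
        kinesis.put_record(
            StreamName=stream,
            Data=raw_event,          # the event object bytes, unchanged
            PartitionKey=partition_key,
        )
        return True
    except Exception:
        # The stream may still be disabled or denying writes; leave the
        # event in the Bucket so a later run can retry it.
        return False
```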