Oracle Cloud Infrastructure (OCI) Streaming provides a fully managed, scalable, and durable messaging solution for ingesting continuous, high-volume streams of data that you can consume and process in real time. Streaming is available in all supported Oracle Cloud Infrastructure regions. For a list, visit the Regions and Availability Domains page.
Streaming is a serverless service that offloads the infrastructure management ranging from networking to storage and the configuration needed to stream your data. You do not have to worry about the provisioning of infrastructure, ongoing maintenance, or security patching. The Streaming service synchronously replicates data across three Availability Domains, providing high availability and data durability. In regions with a single Availability Domain, the data is replicated across three Fault Domains.
Streaming makes it easy to collect, store, and process data generated in real time from hundreds of sources. The number of use cases is nearly unlimited, ranging from messaging to complex data stream processing. The following are some of the many possible uses for Streaming:
You can start using Streaming as follows:
Alternatively, you can use Kafka APIs to produce to and consume from a stream. For more information, refer to Using Streaming with Apache Kafka.
The throughput of Streaming is designed to scale up without limits by adding partitions to a stream. However, there are certain limits to keep in mind while using Streaming:
Streaming provides stream-based semantics. Stream semantics provide strict ordering guarantees per partition, message replayability, client-side cursors, and massive horizontal scale of throughput. Queues do not offer these features. FIFO queues can provide ordering guarantees, but only at the cost of significant performance overhead.
A stream is a partitioned, append-only log of messages to which producer applications write data and from which consumer applications read data.
A stream pool is a grouping that you can use to organize and manage streams. Stream pools simplify operations by letting you share configuration settings across multiple streams. For example, you can set security settings such as custom encryption keys on the stream pool to encrypt the data of all the streams inside the pool. A stream pool also enables you to create a private endpoint for streams by restricting internet access to all of the streams within the stream pool. For customers using Streaming's Kafka compatibility feature, the stream pool serves as the root of a virtual Kafka cluster, so every action on that virtual cluster is scoped to that stream pool.
A partition is a base throughput unit that enables horizontal scale and parallelism of production and consumption from a stream. A partition provides a capacity of 1 MB/sec data input and 2 MB/sec data output. When you create a stream, you specify the number of partitions you need based on the throughput requirements of your application. For example, you can create a stream with 10 partitions, in which case you can achieve a throughput of 10 MB/sec input and 20 MB/sec output from a stream.
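The sizing arithmetic above can be sketched in a few lines of Python. This is a minimal illustration that uses only the per-partition figures quoted in the text (1 MB/sec input, 2 MB/sec output); the function name `partitions_needed` is hypothetical and not part of any Oracle SDK.

```python
import math

# Per-partition capacity as stated in the text above.
INPUT_MB_PER_SEC_PER_PARTITION = 1
OUTPUT_MB_PER_SEC_PER_PARTITION = 2


def partitions_needed(input_mb_per_sec: float, output_mb_per_sec: float) -> int:
    """Return the smallest partition count that covers both the input and output rate."""
    for_input = math.ceil(input_mb_per_sec / INPUT_MB_PER_SEC_PER_PARTITION)
    for_output = math.ceil(output_mb_per_sec / OUTPUT_MB_PER_SEC_PER_PARTITION)
    # A stream always has at least one partition.
    return max(1, for_input, for_output)


if __name__ == "__main__":
    # The example from the text: 10 MB/sec in and 20 MB/sec out.
    print(partitions_needed(10, 20))
    # A read-heavy workload: the output rate decides the partition count.
    print(partitions_needed(3.5, 12))
```

Whichever of the two rates needs more partitions determines the result, so a read-heavy application may need more partitions than its input rate alone suggests.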
A message is a base64-encoded unit of data stored in a stream. The maximum size of a message you can produce to a partition in a stream is 1 MB.
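As a rough sketch of what base64 encoding means for a producer, the Python below encodes a key and a value and rejects oversized payloads. Several details are assumptions made for illustration: the helper name `encode_message`, the dictionary shape with `key` and `value` fields, treating 1 MB as 1,048,576 bytes, and applying the limit to the raw key plus value before encoding. Check the service limits documentation for how the limit is actually measured.

```python
import base64
from typing import Optional

# Assumption: 1 MB is taken as 2**20 bytes and applied to the raw (unencoded) payload.
MAX_MESSAGE_BYTES = 1024 * 1024


def encode_message(value: bytes, key: Optional[bytes] = None) -> dict:
    """Base64-encode a message value and optional key (hypothetical helper)."""
    raw_size = len(value) + (len(key) if key is not None else 0)
    if raw_size > MAX_MESSAGE_BYTES:
        raise ValueError(f"message is {raw_size} bytes; limit is {MAX_MESSAGE_BYTES}")
    return {
        "key": base64.b64encode(key).decode("ascii") if key is not None else None,
        "value": base64.b64encode(value).decode("ascii"),
    }


if __name__ == "__main__":
    message = encode_message(b"hello", key=b"k1")
    print(message)
    # Decoding returns the original bytes.
    print(base64.b64decode(message["value"]))
```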
A key is an identifier used to group related messages. Messages with the same key are written to the same partition. Streaming ensures that any consumer of a given partition will always read that partition's messages in exactly the same order as they were written.
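The idea that equal keys land on the same partition can be illustrated with a deterministic hash. The sketch below is not the algorithm Streaming uses (the text does not describe it); `partition_for_key` and the choice of MD5 are hypothetical and only show why a stable hash of the key keeps related messages together and therefore in order.

```python
import hashlib


def partition_for_key(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition index with a stable hash (illustrative only)."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    digest = hashlib.md5(key).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_partitions


if __name__ == "__main__":
    for key in (b"sensor-1", b"sensor-2", b"sensor-1"):
        # The two b"sensor-1" messages always map to the same partition.
        print(key, partition_for_key(key, num_partitions=10))
```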
A producer is a client application that can write messages to a stream.
A consumer is a client application that can read messages from one or more streams. A consumer group is a set of consumer instances that coordinate reading messages from all of the partitions in a stream. At any given time, the messages from a specific partition can be consumed by only a single consumer in the group.
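The rule that each partition is read by only one consumer in a group at a time can be pictured as an assignment of partitions to group members. The sketch below uses a simple round-robin split; in practice the service coordinates the group, and the function name `assign_partitions` and the round-robin strategy are assumptions chosen only to show the invariant.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Give every partition exactly one owner within the group (illustrative round-robin)."""
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    assignment: Dict[str, List[int]] = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    # Five partitions shared by two consumers: no partition appears twice.
    print(assign_partitions(5, ["consumer-a", "consumer-b"]))
    # More consumers than partitions: the extra consumer gets nothing to read.
    print(assign_partitions(2, ["consumer-a", "consumer-b", "consumer-c"]))
```

One consequence visible in the second call is that adding more consumers than there are partitions does not increase read parallelism.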
A cursor is a pointer to a location in a stream. This location could be a pointer to a specific offset or time in a partition, or to a group's current location.
Each message within a partition has an identifier called an offset. Consumers can read messages starting from any offset they choose. Consumers can also commit the latest processed offset so that, if they stop and then restart, they can resume their work without replaying or missing a message.
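The following in-memory model is a minimal sketch of how offsets and commits interact. It does not call the Streaming API; the classes `PartitionLog` and `Consumer` are hypothetical and exist only to show that a restarted consumer resumes right after its last committed offset.

```python
from typing import List, Optional, Tuple


class PartitionLog:
    """Append-only list of messages; the list index plays the role of the offset."""

    def __init__(self) -> None:
        self._messages: List[str] = []

    def append(self, value: str) -> int:
        self._messages.append(value)
        return len(self._messages) - 1  # offset of the message just written

    def read(self, offset: int, limit: int = 10) -> List[Tuple[int, str]]:
        end = min(offset + limit, len(self._messages))
        return [(o, self._messages[o]) for o in range(offset, end)]


class Consumer:
    """Reads from a log and remembers the latest processed offset."""

    def __init__(self, log: PartitionLog, committed: Optional[int] = None) -> None:
        self.log = log
        self.committed = committed  # None means nothing has been processed yet

    def poll(self, limit: int = 10) -> List[Tuple[int, str]]:
        start = 0 if self.committed is None else self.committed + 1
        return self.log.read(start, limit)

    def commit(self, offset: int) -> None:
        self.committed = offset


if __name__ == "__main__":
    log = PartitionLog()
    for value in ("a", "b", "c", "d"):
        log.append(value)

    first = Consumer(log)
    batch = first.poll(limit=2)
    first.commit(batch[-1][0])  # commit the latest processed offset

    # Simulate a restart: a new consumer starts from the stored commit.
    restarted = Consumer(log, committed=first.committed)
    print(restarted.poll())  # continues with the messages after the commit
```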
Streaming provides data encryption by default, both at rest and in transit. Streaming is fully integrated with Oracle Cloud Infrastructure Identity and Access Management (IAM), which lets you use access policies to selectively grant permissions to users and groups of users. While using REST APIs, you can also securely PUT and GET your data from Streaming through SSL endpoints with HTTPS protocol. Further, Streaming provides complete tenant-level isolation of data without any "noisy neighbor" problems.
Streaming data is encrypted both at rest and in transit, along with ensuring message integrity. You can let Oracle manage encryption, or use Oracle Cloud Infrastructure Vault to securely store and manage your own encryption keys if you need to meet specific compliance or security standards.
You can edit the stream pool's data encryption settings at any time if you would like to switch between using "Encryption provided by Oracle Keys" and "Encryption managed using Customer Managed Keys". Streaming does not impose any restrictions on how many times this activity can be performed.
Streaming is fully integrated with Oracle Cloud Infrastructure IAM. Every stream has a compartment assigned. Users can specify role-based access control policies that may be used to describe fine-grained rules at a tenancy, compartment, or single-stream level.
An access policy is specified in the form "Allow <subject> to <verb> <resource-type> in <location> where <conditions>".
Authentication with the Kafka protocol uses auth tokens and the SASL/PLAIN mechanism. You can generate tokens on the console user details page. See Working with Auth Tokens for more information. We recommend you create a dedicated group/user and grant that group the permission to manage streams in the appropriate compartment or tenancy. You then can generate an auth token for the user you created and use it in your Kafka client configuration.
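A Kafka client configuration for this setup might look like the sketch below. The property names follow the librdkafka style used by clients such as confluent-kafka. The `SASL_SSL` security protocol and the `tenancy/user/stream-pool-OCID` username format are assumptions based on common OCI usage and should be verified against Using Streaming with Apache Kafka. The helper name `kafka_client_config` and every placeholder value are hypothetical.

```python
def kafka_client_config(
    bootstrap_server: str,
    tenancy_name: str,
    username: str,
    stream_pool_ocid: str,
    auth_token: str,
) -> dict:
    """Build a SASL/PLAIN client configuration (property names in librdkafka style)."""
    # Assumption: the SASL username combines tenancy, user, and stream pool OCID.
    sasl_username = f"{tenancy_name}/{username}/{stream_pool_ocid}"
    return {
        "bootstrap.servers": bootstrap_server,
        "security.protocol": "SASL_SSL",  # assumption: SASL over TLS
        "sasl.mechanism": "PLAIN",        # mechanism named in the text above
        "sasl.username": sasl_username,
        "sasl.password": auth_token,      # the auth token generated in the console
    }


if __name__ == "__main__":
    config = kafka_client_config(
        bootstrap_server="<kafka-bootstrap-host>:<port>",  # placeholder
        tenancy_name="<tenancy-name>",                     # placeholder
        username="<streaming-user>",                       # placeholder
        stream_pool_ocid="<stream-pool-ocid>",             # placeholder
        auth_token="<auth-token>",                         # placeholder
    )
    # Avoid printing the token itself.
    print({k: v for k, v in config.items() if k != "sasl.password"})
```

Keeping the auth token out of source control and logs is advisable, since it acts as the password for the dedicated user.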
Private endpoints restrict access to a specified virtual cloud network (VCN) within your tenancy so that its streams cannot be accessed through the internet. Private endpoints associate a private IP address within a VCN to the stream pool, allowing Streaming traffic to avoid traversing the internet. To create a private endpoint for Streaming, you need access to a VCN with a private subnet when you create the stream pool. See About Private Endpoints and VCNs and Subnets for more information.
You can write the contents of a stream directly to an Object Storage bucket, typically to persist the data in the stream for long-term storage. This can be achieved using Kafka Connect for S3 with Streaming. For more information, see the Publishing To Object Storage From Oracle Streaming Service blog post.
You can ingest data from a table in an Oracle Autonomous AI Transaction Processing instance. For more information, see the Using Kafka Connect With Oracle Streaming Service And Autonomous DB blog post.
You can use the Kafka SDKs to produce and consume messages from Streaming, and you can use Micronaut's built-in support for Kafka. For more information, see the Easy Messaging With Micronaut's Kafka Support And Oracle Streaming Service blog post.
For more information, see the Ingest IoT Data from MQTT Brokers into OCI-Oracle Streaming Service, OCI- Kafka Connect Harness, and Oracle Kubernetes Engine blog post.
Oracle GoldenGate for Big Data is now certified to integrate with Streaming. For more information, see Connecting to Oracle Streaming Service in the Oracle GoldenGate for Big Data documentation.
To transport streaming data directly into Oracle Autonomous AI Lakehouse, use the Kafka Connect JDBC Sink connector.
Streaming uses simple pay-as-you-use pricing, which ensures you only pay for the resources you use. The pricing dimensions include
Please refer to the OCI Streaming page for the latest pricing information.
Streaming’s industry-leading pricing model ensures that you pay only when you use the service within the default service limits.
Streaming does not charge an additional price for moving data in and out of the service. Further, users can leverage the power of Service Connector Hub to move data to and from Streaming in a serverless manner at no additional price.
Streaming currently doesn't operate in the free tier.
Identity and Access Management lets you control who has access to your cloud resources. To use Oracle Cloud Infrastructure resources, you must be given the required type of access in a policy written by an administrator, whether you're using the Console or the REST API with an SDK, CLI, or other tools. Access policy is specified in the form of
Allow <subject> to <verb> <resource-type> in <location> where <conditions>
Administrators of a tenancy can use the policy
Allow group StreamAdmins to manage streams in tenancy
which lets the specified group, StreamAdmins, do everything with Streaming, including creating, updating, listing, and deleting streams and their related resources. However, you can always write more granular policies so that only selected users in a group can perform a subset of activities on a given stream. If you're new to policies, see Getting Started with Policies and Common Policies. If you want to dig deeper into writing policies for Streaming, see Details for the Streaming Service in the IAM policy reference.
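To make the policy syntax concrete, the sketch below assembles a statement from its parts. The helper `policy_statement` is hypothetical and performs no validation against IAM; the only values taken from the text are the group StreamAdmins, the verb manage, the resource type streams, and the location tenancy.

```python
from typing import Optional


def policy_statement(
    subject: str,
    verb: str,
    resource_type: str,
    location: str,
    conditions: Optional[str] = None,
) -> str:
    """Format 'Allow <subject> to <verb> <resource-type> in <location> where <conditions>'."""
    statement = f"Allow {subject} to {verb} {resource_type} in {location}"
    if conditions:
        # The where clause is optional and narrows the statement further.
        statement += f" where {conditions}"
    return statement


if __name__ == "__main__":
    # Reproduces the administrator policy quoted above.
    print(policy_statement("group StreamAdmins", "manage", "streams", "tenancy"))
```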
You can provision a stream and all of its associated components, such as IAM policies, partitions, and encryption settings, using Oracle Cloud Infrastructure Resource Manager or the Terraform provider for Oracle Cloud Infrastructure. For information on the Terraform provider, see