Chapter 12 Readable stream: how do we control the read() function?
In chapter 12 (Streams), in the section on Readable streams, there is an example of a Readable stream:
```
'use strict'
const { Readable } = require('stream')
const createReadStream = () => {
  // what if the `data` is a looooooong serialized db or a 100000 length array
  const data = ['some', 'data', 'to', 'read']
  return new Readable({
    read () {
      if (data.length === 0) this.push(null)
      else this.push(data.shift())
    }
  })
}
const readable = createReadStream()
readable.on('data', (data) => { console.log('got data', data) })
readable.on('end', () => { console.log('finished reading') })
```
The read() function contains a condition: each call removes the first item of the data array with shift() and pushes it, and when there are no items left, we push null so that the end event is emitted.
I have some questions about that approach, and I'd like to know how we would handle some other cases:
- Is this the right approach to track the remaining data of an array, by extracting one item at a time until there are none left?
- Also, from a performance point of view, if that array has 10000 items, we'll emit the 'data' event 10000 times!
- How do we track the remaining data if data is a large serialised database (a string)? What condition should we put into read() to know when to emit the 'data' event and when to emit the 'end' event?
Thank you
Best Answer
-
hey @theodoros
Code is always about context, performance isn't always priority #1 - and this is coming from someone who has written, spoken and consulted extensively around performance in Node. This code is optimized for communication, for teaching the general concepts and API of streams. With that in mind:
- Typically readable streams are for connecting with some kind of I/O; transmitting data isn't a big use case beyond test code and example code. Outside of explaining the API, a better way to do this is to just use Readable.from(array) and you have your readable stream emitting data, so there's no need to be concerned about the details.
- Performance isn't a concern here; in fact, any time you emit in-memory data from a stream (e.g. in tests) performance tends not to be a concern. On a side note though, streams improve performance for I/O scenarios, particularly where you have a large amount of data; they do not improve CPU compute performance. By regulating and processing incremental data, they support an optimal pattern for handling I/O in specific circumstances.
- That depends entirely on context. Consider TCP: it's a protocol with the ability to indicate (among other things) connecting and disconnecting. A stream around TCP (e.g. a net socket) would know when to end based on a protocol instruction. If a database supports streaming, its drivers will know how to interpret end of stream, and a streaming implementation around those drivers would take that instruction and turn it into a push(null) to end the stream.
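As a minimal sketch of the Readable.from(array) suggestion above, the chapter example could be reduced to the following. The `received` array is a hypothetical addition, used only to collect the chunks for inspection.

```javascript
'use strict'
const { Readable } = require('stream')

// Readable.from() wraps an iterable (here an in-memory array) in a
// readable stream and ends the stream once the iterable is exhausted,
// so there is no read() function to write and no push(null) to manage.
const readable = Readable.from(['some', 'data', 'to', 'read'])

// Hypothetical helper array, only here to collect what was emitted.
const received = []

readable.on('data', (chunk) => {
  received.push(chunk)
  console.log('got data', chunk)
})
readable.on('end', () => {
  console.log('finished reading')
})
```

Readable.from() creates an object mode stream by default, so each chunk arrives as the original string rather than as a Buffer.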
@krave, for your questions:
- The default high watermarks of 16kb (write) and 64kb (read) tend to be fine; beyond that it's a fine-tuning exercise that's highly dependent on the context.
- That's a huge topic. Probably the most trivial approach would be a stream wrapper around an existing streaming media processor, e.g. ffmpeg. This project looks interesting: https://github.com/amishshah/prism-media
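For the fine-tuning mentioned in the first point, the high watermark can be set per stream through the highWaterMark option. The sketch below reuses the chapter's createReadStream with a hypothetical `highWaterMark` parameter; the value 8 is arbitrary and only there to show the option taking effect.

```javascript
'use strict'
const { Readable } = require('stream')

// `highWaterMark` is passed straight through to the Readable constructor.
// It is the number of bytes the stream buffers internally before it stops
// calling read() until the buffered data has been consumed.
const createReadStream = (highWaterMark) => {
  const data = ['some', 'data', 'to', 'read']
  return new Readable({
    highWaterMark,
    read () {
      if (data.length === 0) this.push(null)
      else this.push(data.shift())
    }
  })
}

// An arbitrary small threshold, for illustration only.
const small = createReadStream(8)
console.log('custom high watermark:', small.readableHighWaterMark)

// Without the option, the stream uses the default of the running Node version.
const standard = new Readable({ read () {} })
console.log('default high watermark:', standard.readableHighWaterMark)
```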
Answers
-
Hi @theodoros, I would like to join this conversation because I have a related confusion too.
Some thoughts about your questions
- Keeping an index that points to where the last item was read is my approach to such tasks. I think that would be more performant.
- The size of each push is under your control. For example, you can push 10 items each time.
- For strings, I would slice the large string into pieces and keep a record of the index the stream last read from, then advance the index by the step size on each read. If the index points past the end of the string, I stop right away. Here is my code.
```
'use strict'
const { Readable } = require('stream')
const createReadStream = () => {
  // what if the `data` is a looooooong serialized db or a 100000 length array
  const data = '123456789'
  let index = 0
  const step = 6
  return new Readable({
    read () {
      // `>=` also ends the stream when the length is an exact multiple of `step`
      if (index >= data.length) {
        this.push(null)
      } else {
        this.push(data.slice(index, index + step))
        index = index + step
      }
    }
  })
}
const readable = createReadStream()
readable.on('data', (data) => { console.log('got data:', data.toString()) })
readable.on('end', () => { console.log('finished reading') })
```
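The index idea from the first two points can be applied to arrays as well. The following is only a sketch under assumptions: `createBatchedStream`, `batchSize` and the sample `items` array are hypothetical names, and the stream runs in object mode so that each chunk can be an array of items.

```javascript
'use strict'
const { Readable } = require('stream')

// Hypothetical helper: streams `items` in batches of `batchSize`
// by advancing an index instead of mutating the array with shift().
const createBatchedStream = (items, batchSize = 10) => {
  let index = 0
  return new Readable({
    objectMode: true, // each chunk is an array of items, not a Buffer
    read () {
      if (index >= items.length) {
        this.push(null) // nothing left: signal the end of the stream
        return
      }
      this.push(items.slice(index, index + batchSize))
      index += batchSize
    }
  })
}

// Sample data for illustration: the numbers 0 to 24.
const items = Array.from({ length: 25 }, (_, i) => i)
const batches = []

const batched = createBatchedStream(items, 10)
batched.on('data', (batch) => {
  batches.push(batch)
  console.log('got batch of', batch.length, 'items')
})
batched.on('end', () => {
  console.log('finished reading')
})
```

With 25 items and a batch size of 10, the 'data' event fires three times instead of 25, which is the trade-off the second point describes.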
My questions:
1. How big should a chunk be? I am referring to something like the step in my code above, particularly when chunks are being sent over the network.
2. How do you stream video data, for example live video streaming? Are there any good references or tutorials?
-
Oh, didn't know that project before. Thanks!
-
np