Chapter 12 Readable stream: how do we control the read() function?
In chapter 12 (Streams), in the Readable streams section, there is an example of a Readable stream:
```
'use strict'
const { Readable } = require('stream')
const createReadStream = () => {
  // what if the `data` is a looooooong serialized db or a 100000 length array
  const data = ['some', 'data', 'to', 'read']
  return new Readable({
    read () {
      if (data.length === 0) this.push(null)
      else this.push(data.shift())
    }
  })
}
const readable = createReadStream()
readable.on('data', (data) => { console.log('got data', data) })
readable.on('end', () => { console.log('finished reading') })
```
In the read() function there is a condition: each call removes the first item of the data array with shift() and pushes it, and when there are no more items left, we push null to emit the end event.
I have some questions about that approach, and I'd like to know how we would handle some other cases:
- Is this the right approach to track the remaining data of an array, removing one item each time until there are no more?
- Also, from a performance point of view, if that array has 10000 items, we'll emit the 'data' event 10000 times!
- How do we track the remaining data if data is a large serialised database (a string)? What condition should we put in read() to know when to emit the 'data' event and when to emit the 'end' event?
thank you
Best Answer
hey @theodoros
Code is always about context, performance isn't always priority #1 - and this is coming from someone who has written, spoken and consulted extensively around performance in Node. This code is optimized for communication, for teaching the general concepts and API of streams. With that in mind:
- Typically readable streams are for connecting with some kind of I/O; transmitting in-memory data isn't a big use case beyond test code and example code. Outside of explaining the API, a better way to do this is to just use Readable.from(array). You then have a readable stream emitting the data, and there's no need to be concerned about the details.
- Performance isn't a concern here; in fact, any time you emit in-memory data from a stream (e.g. in tests) performance tends not to be a concern. On a side note though, streams improve performance for I/O scenarios, particularly where you have a large amount of data. They do not improve CPU compute performance. By regulating and processing incremental data, they support an optimal pattern for handling I/O in specific circumstances.
- That depends entirely on context. Consider TCP: it's a protocol with the ability to indicate (among other things) connecting and disconnecting. A stream around TCP (e.g. a net socket) would know when to end based on a protocol instruction. If a database supports streaming, its drivers will know how to interpret end of stream, and a streaming implementation around those drivers would take that instruction and turn it into a push(null) to end the stream.
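As a minimal sketch of the Readable.from suggestion above: the first stream is built from the array in the original example, and the second from a generator that yields slices of a string, which is one way to handle the "large serialised string" case. The `slices` helper and the chunk size of 6 are illustrative assumptions, not part of the course material.

```javascript
'use strict'
const { Readable } = require('stream')

// Readable.from accepts any iterable or async iterable,
// and ends the stream once the iterable is exhausted.
const fromArray = Readable.from(['some', 'data', 'to', 'read'])

const arrayChunks = []
fromArray.on('data', (chunk) => {
  arrayChunks.push(chunk)
  console.log('got data', chunk)
})
fromArray.on('end', () => { console.log('finished reading array') })

// Hypothetical helper: lazily yields fixed-size slices of a string.
function * slices (text, size) {
  for (let i = 0; i < text.length; i += size) {
    yield text.slice(i, i + size)
  }
}

const fromString = Readable.from(slices('123456789', 6))

const stringChunks = []
fromString.on('data', (chunk) => {
  stringChunks.push(chunk)
  console.log('got slice', chunk)
})
fromString.on('end', () => { console.log('finished reading string') })
```

In both cases the end of the stream comes from the iterable finishing, so there is no hand-written condition in read() to maintain.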
@krave, for your questions:
- The default high watermarks of 16kb (write) and 64kb (read) tend to be fine; beyond that it's a fine-tuning exercise that's highly dependent on the context.
- That's a huge topic. Probably the most trivial approach would be a stream wrapper around an existing streaming media processor, e.g. ffmpeg. This project looks interesting: https://github.com/amishshah/prism-media
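For reference, a small sketch of where the high watermark is set and how to inspect it. The values 1024 and 4096 are arbitrary illustrations, not recommendations. Defaults can differ between stream types and Node.js versions, so the sketch prints the default instead of assuming it.

```javascript
'use strict'
const { Readable, Writable } = require('stream')

// highWaterMark is a constructor option, in bytes for non-object-mode streams.
const tunedReadable = new Readable({
  highWaterMark: 1024,
  read () {}
})

const tunedWritable = new Writable({
  highWaterMark: 4096,
  write (chunk, encoding, callback) { callback() }
})

// A stream created without the option uses the platform default.
const defaultReadable = new Readable({ read () {} })

console.log('tuned readable:', tunedReadable.readableHighWaterMark)
console.log('tuned writable:', tunedWritable.writableHighWaterMark)
console.log('default readable:', defaultReadable.readableHighWaterMark)
```

The high watermark is a threshold for backpressure signalling rather than a hard limit on chunk size.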
Answers
Hi @theodoros, I would like to join this conversation because I have some related confusion too.
Some thoughts about your questions:
- Keeping an index that points to where the last item was read is my approach to such tasks. I think that would be more performant.
- The size of each push can be under your control. For example, you can push 10 items each time.
- As for the string scenario, I would slice the large string into pieces and keep a record of the index where the stream last read, incrementing it on each read. Once the index points past the end of the string, I stop right away. Here is my code:
    'use strict'
    const { Readable } = require('stream')
    const createReadStream = () => {
      // what if the `data` is a looooooong serialized db or a 100000 length array
      const data = '123456789'
      let index = 0 // position of the next unread character
      const step = 6 // number of characters pushed per read() call
      return new Readable({
        read () {
          // use >= so that no empty chunk is pushed when
          // data.length is an exact multiple of step
          if (index >= data.length) {
            this.push(null)
          } else {
            this.push(data.slice(index, index + step))
            index = index + step
          }
        }
      })
    }
    const readable = createReadStream()
    readable.on('data', (data) => { console.log('got data:', data.toString()) })
    readable.on('end', () => { console.log('finished reading') })
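The suggestion above to push several array items at a time could be sketched along the same lines. This is only one possible approach: it assumes an object-mode stream where each chunk is a sub-array, and `createBatchedStream` and the batch size of 10 are hypothetical names and values chosen for illustration.

```javascript
'use strict'
const { Readable } = require('stream')

// Hypothetical helper: emits `items` in sub-arrays of up to `batchSize`.
const createBatchedStream = (items, batchSize) => {
  let index = 0 // position of the next unread item
  return new Readable({
    objectMode: true, // each chunk is an array, not a string or Buffer
    read () {
      if (index >= items.length) {
        this.push(null) // nothing left, end the stream
        return
      }
      this.push(items.slice(index, index + batchSize))
      index += batchSize
    }
  })
}

const items = Array.from({ length: 25 }, (_, i) => i)
const batches = []
const batched = createBatchedStream(items, 10)

batched.on('data', (batch) => {
  batches.push(batch)
  console.log('got batch of', batch.length)
})
batched.on('end', () => { console.log('finished reading') })
```

With 25 items and a batch size of 10, the 'data' event fires three times instead of 25, and the source array is never mutated.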
My questions:
1. How big should a chunk be? I'm referring to something like the step in my code above, particularly when chunks are being sent over the network.
2. How do we stream video data, for example live video streaming? Are there any good references or tutorials?
Oh, didn't know that project before. Thanks!
np