Data Representation

Computer

A computer is a multimedia device that stores, represents, and modifies numbers, text, audio, images and graphics, and video. Data is stored as binary digits (bits), and each item of data is represented as a binary string (a string of 1s and 0s) (Dale and Lewis).

Analog and Digital Data

There are two ways to represent data: analog and digital. Analog data is a continuous representation of data, while digital data is a discrete representation. Digitization is the act of breaking a continuous entity (an analog signal) down into discrete pieces (a digital signal) (Dale and Lewis).

Analog and Digital Signals

An analog signal continually fluctuates up and down in voltage. A digital signal has only a high or low state (modem).

[Image: screenshot] (Dale and Lewis)

Degradation

Degradation is the lowering of character or quality. All electronic signals (both analog and digital) degrade as they move down a line, because the voltage of the signal fluctuates due to environmental effects. (Pyae)

[Image: screenshot] (Dale and Lewis)

As soon as an analog signal degrades, information is lost, and the signal is no longer reliable. A digital signal jumps between only two extremes, so a slightly degraded signal can still be read correctly as high or low. Periodically, a digital signal is reclocked (the original digital signal is reasserted before too much degradation occurs) to regain its original shape (Dale and Lewis). That’s why computers use digital data to represent information.
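The idea of reclocking can be illustrated with a minimal Python sketch. The voltage levels, the threshold, and the noise values are made up for illustration and do not come from the textbook; the point is only that a degraded two-level signal can be snapped back to its original shape as long as the noise stays small.

```python
# Hypothetical voltage levels: 0.0 V means "low" (bit 0), 5.0 V means "high" (bit 1).
LOW, HIGH = 0.0, 5.0
THRESHOLD = (LOW + HIGH) / 2


def reclock(voltages):
    """Snap each degraded voltage back to the nearer of the two legal levels."""
    return [HIGH if v > THRESHOLD else LOW for v in voltages]


original = [HIGH, LOW, LOW, HIGH, HIGH, LOW]

# Made-up noise standing in for environmental effects on the line.
noise = [-0.9, 0.7, -0.3, -1.4, 0.5, 1.1]
degraded = [v + n for v, n in zip(original, noise)]

restored = reclock(degraded)
print(degraded)
print(restored == original)
```

An analog signal has no such pair of legal levels to snap back to, so the same noise would permanently change its value.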

Binary Representation

1 bit represents 2 things (2^1)

2 bits represent 4 things (2^2)

3 bits represent 8 things (2^3)

n bits represent 2^n things (Dale and Lewis)
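The pattern above can be checked with a short Python sketch. The function names are chosen here for illustration; the sketch simply lists every binary string of a given length and counts them.

```python
from itertools import product


def count_values(n_bits):
    """Number of distinct things that n_bits can represent."""
    return 2 ** n_bits


def all_bit_strings(n_bits):
    """Every binary string of length n_bits, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n_bits)]


for n in (1, 2, 3):
    print(n, count_values(n), all_bit_strings(n))
```

Each extra bit doubles the number of strings, because every existing string can be extended with either a 0 or a 1.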

Representing Text

A character set is a list of characters and the binary string used to represent each one. Two widely used character sets are ASCII and Unicode.

ASCII stands for American Standard Code for Information Interchange. ASCII originally used seven bits to represent each character, allowing 128 unique characters. It was extended later: the updated version, called Extended ASCII, uses eight bits (one byte) per character, allowing 256 unique characters, and includes accented letters as well as several other special symbols (Dale and Lewis).

[Image: screenshot] (Dale and Lewis)

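Unicode was originally designed to use 16 bits to represent each character (2^16 = 65,536 characters), and the standard has since grown beyond that range. The first 128 characters are the same as ASCII, and the first 256 match a common Extended ASCII set (Latin-1).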
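As a small illustration, Python's built-in `ord` and `chr` functions map between characters and their Unicode code numbers, which for plain English letters are the same as their ASCII codes. The helper name `to_bit_string` is made up for this sketch.

```python
def to_bit_string(char, width=8):
    """Return the character's code number as a binary string of the given width."""
    return format(ord(char), f"0{width}b")


for ch in "Hi!":
    print(ch, ord(ch), to_bit_string(ch))

# An accented letter, written as 16 bits.
print(to_bit_string("\u00e9", 16))

# Going the other way: from a binary string back to a character.
print(chr(int("01000001", 2)))
```

Here the letter "A" has code 65, which is 01000001 as an 8-bit string.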

Standardizing: You may ask why computer manufacturers adopted ASCII and Unicode as universal character sets. The reason is that a shared standard makes it easy to communicate and transfer data across different platforms.

Data Compression

Data compression is the action of reducing the amount of space needed to store a piece of data.

We compress files because a smaller file is easier to share with others and because computer storage is limited. The number of bits that can be sent is also limited by bandwidth, which is the maximum number of bits or bytes that can be transmitted from one place to another in a fixed amount of time.

The compression ratio is the size of the compressed data divided by the size of the uncompressed data. This ratio indicates how much compression occurs. Compression is lossless when the original data can be retrieved without losing any information, and lossy when some information is lost in the process of compression. In short, we need to trade off accuracy against size.
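The ratio and the lossless property can be tried out with Python's standard `zlib` module. This is only a sketch: the sample data is made up, and zlib stands in for lossless compression in general rather than for any format named in these notes.

```python
import zlib


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Size of the compressed data divided by the size of the uncompressed data."""
    return len(compressed) / len(original)


# Made-up, highly repetitive sample data (repetition compresses well).
data = b"AAAAABBBBBCCCCC" * 100

compressed = zlib.compress(data)
restored = zlib.decompress(compressed)
ratio = compression_ratio(data, compressed)

print(len(data), len(compressed), round(ratio, 3))
print(restored == data)  # lossless: the original comes back exactly
```

A ratio close to 0 means the data shrank a lot; a ratio close to 1 means almost nothing was saved.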

For example, compression techniques such as JPEG and GIF for images and MP3 for sound have been developed to reduce file size while minimizing the deterioration of quality (Dale and Lewis).

Representing and Encoding Images, Audio, and Video

Key Terms:

Pixel: An individual dot (small square) used to represent a picture

Resolution: The number of pixels used to represent a picture. It is measured in dots per inch (dpi).

Color Depth: The number of bits used to represent the color of each pixel (“Encoding Images”)

  • The color depth of an image is measured in bits:
    • 1-bit color depth = 2 (2^1) available colors (black & white)
    • 2-bit color depth = 4 (2^2) available colors (black, dark grey, light grey, white) (“Encoding Images”)
[Image: screenshot] ("Image And Sound Representation")

There are two ways to store an image’s information: bitmaps and vectors. A bitmap is an image made up of pixels. This type of image loses quality if its width and/or height are increased. The color of each pixel is stored as a binary number. Pictures in raster-graphics formats store image information pixel by pixel, for example:

  • JPEG (Joint Photographic Experts Group, a digital image format, lossy compression)
  • GIF (Graphics Interchange Format, an 8-bit digital image format, lossless compression)
  • PNG (Portable Network Graphics, a digital image format, lossless compression)

The more bytes of memory that are used to represent an image pixel or a second of sound, the better the reproduction, but the file size increases as well (“Encoding Images”).
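The link between color depth and file size can be estimated with a rough Python sketch. It assumes an uncompressed bitmap and ignores metadata and file headers, so real files will differ; the function names and the 800 x 600 image size are chosen here only for illustration.

```python
def available_colors(color_depth_bits):
    """Number of colors a pixel can take at the given color depth."""
    return 2 ** color_depth_bits


def bitmap_size_bytes(width, height, color_depth_bits):
    """Approximate size of an uncompressed bitmap, rounded up to whole bytes."""
    total_bits = width * height * color_depth_bits
    return (total_bits + 7) // 8


for depth in (1, 8, 24):
    print(depth, available_colors(depth), bitmap_size_bytes(800, 600, depth))
```

Going from 1-bit to 24-bit color makes the same 800 x 600 picture 24 times larger before any compression is applied.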

A vector image stores its information as mathematical instructions for how to draw it. This means its width and height can be increased without loss of quality (“Encoding Images”). SVG (Scalable Vector Graphics, a file format) is an example.

Metadata

Metadata means ‘data about data’. It provides information about the image, such as the filename, file format, resolution, and color depth (“Encoding Images”).

Audio and video are encoded as binary data as well. Digital sound is broken down into thousands of samples per second, and each sound sample is stored as binary data.

Quality Determinants:

  • sample rate – the number of audio samples captured every second, measured in hertz (Hz)
  • bit depth – the number of bits available for each sample (similar to color depth which measures how many colors are available for each pixel)
  • bit rate – the number of bits used per second of audio
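The three quantities above are related, as the following Python sketch suggests for uncompressed audio. The `channels` parameter (for example, 2 for stereo) is an assumption added here and is not mentioned in the sources, and the example numbers are the commonly quoted values for CD audio.

```python
def bit_rate(sample_rate_hz, bit_depth, channels=1):
    """Bits used per second of uncompressed audio."""
    return sample_rate_hz * bit_depth * channels


def audio_size_bytes(sample_rate_hz, bit_depth, seconds, channels=1):
    """Approximate size in bytes of an uncompressed recording (no headers)."""
    return bit_rate(sample_rate_hz, bit_depth, channels) * seconds // 8


# CD-style audio: 44,100 samples per second, 16 bits per sample, 2 channels.
cd_rate = bit_rate(44_100, 16, channels=2)
one_minute = audio_size_bytes(44_100, 16, 60, channels=2)

print(cd_rate)      # bits per second
print(one_minute)   # bytes for one minute
```

Raising either the sample rate or the bit depth improves quality but increases the bit rate, and therefore the file size, in direct proportion.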

Formats:

  • FLAC & ALAC: lossless compression formats
  • MP3 and AAC: lossy compression formats (“Encoding Audio And Video”)

Representing Audio and Images

[Image: screenshot] ("Image And Sound Representation")

Pulse Code Modulation (PCM) is a method of using digital signals to represent analog signals. Sampling is the process of measuring the analog signal at regular intervals; the number of measurements per second is the sampling rate. Quantization is the process of breaking down a continuous range of information into a finite range of values. Encoding is the process of converting information or instructions into a particular form. In this case, each quantized sample is converted into a binary code, so the analog signal becomes a digital one.
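The three steps can be sketched in Python. This is a simplified illustration, not a full PCM implementation: the "analog" signal is a made-up 1 Hz sine wave, and the sample rate and bit depth are kept tiny so the result is easy to read.

```python
import math


def pcm_encode(signal, sample_rate, duration, bit_depth):
    """Sample, quantize, and encode a signal whose values lie in [-1, 1]."""
    levels = 2 ** bit_depth
    codes = []
    for i in range(int(sample_rate * duration)):
        t = i / sample_rate                 # sampling: pick regular points in time
        amplitude = signal(t)
        # Quantization: map the continuous value onto one of `levels` steps.
        level = round((amplitude + 1) / 2 * (levels - 1))
        level = max(0, min(levels - 1, level))
        # Encoding: write the step number as a binary string.
        codes.append(format(level, f"0{bit_depth}b"))
    return codes


def tone(t):
    """Stand-in for an analog signal: a 1 Hz sine wave."""
    return math.sin(2 * math.pi * t)


codes = pcm_encode(tone, sample_rate=8, duration=1, bit_depth=3)
print(codes)
```

With a 3-bit depth every sample must fall on one of only 8 levels, which is why a higher bit depth gives a closer copy of the original wave.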

Works Cited

“Binary To Decimal”. Carmamovie.com. N.p., 2016. Web. 17 Sept. 2016.

Dale, Nell and John Lewis. Computer Science Illuminated. Print.

“Encoding Audio And Video”. Bbc.co.uk. N.p., 2016. Web. 17 Sept. 2016.

“Encoding Images”. Bbc.co.uk. N.p., 2016. Web. 17 Sept. 2016.

“Image And Sound Representation”. Presentation.
