This code notebook describes the use of tensor slicing to build higher-dimensional neural networks.
In a typical deep neural network architecture designed to handle image data, the input is represented as a three-dimensional (3D) tensor. These tensors store the pixel values along the height and width dimensions and the color information along the channel dimension.
Visual representation of a three-dimensional tensor for storing a two-dimensional color image
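As a minimal sketch of this representation (the 224x224 resolution below is an arbitrary example size), such a tensor can be created in TensorFlow as follows:

```python
import tensorflow as tf

# Hypothetical example: a single 224x224 RGB image stored as a 3D tensor.
image = tf.zeros([224, 224, 3])   # (height, width, color channels)
print(image.shape)                # (224, 224, 3)
```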
Neural network architectures that ingest these 3D tensors are capable of achieving state-of-the-art performance in image classification and segmentation problems.
However, they lack the ability to encode contextual information or the relative position of an image within a group of images. This contextual, sequence-level information is especially important for handling complex data structures such as video frames or MRI slices. A logical next step in the evolution of neural network capabilities is to add an extra dimension to the input, transforming a three-dimensional input tensor into a four-dimensional (4D) one. This additional dimension can be leveraged to convey contextual information or the relative position of each image in a sequence.
Visual representation of a 4D tensor for storing 2D color image sequences
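A minimal sketch of such a 4D tensor, assuming a sequence of 16 frames of 224x224 RGB images (both the sequence length and the resolution are arbitrary example values):

```python
import tensorflow as tf

# Hypothetical example: 16 RGB frames of size 224x224 stacked along
# a new leading (sequence) dimension.
sequence = tf.zeros([16, 224, 224, 3])   # (frames, height, width, channels)
print(sequence.shape)                    # (16, 224, 224, 3)
```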
A key challenge in handling these 4D inputs is building neural networks capable of ingesting such higher-dimensional tensors.
Because benchmark datasets and available models are extremely limited for 4D-tensor neural networks, an alternative to building bespoke 4D networks is to repurpose existing state-of-the-art 3D networks to ingest the 4D input tensors. A scalable and relatively straightforward approach is to stack the 2D-color-image-processing networks (3D models, which take 3D input tensors) as a series of encoders, and then pass the outputs of these encoders to a merge layer that combines all the features extracted from the 4D input tensor.
Visual representation of 3D model stacking to handle 4D input tensors
To sequentially pass each 3D component tensor of a 4D input through these encoders, tensor slicing comes in extremely handy.
An example Python implementation of tensor slicing using a for-loop in TensorFlow
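A minimal sketch of such a slicing loop, assuming the sequence dimension is the leading axis of the 4D tensor (all sizes below are arbitrary example values):

```python
import tensorflow as tf

# Hypothetical example: a 4D tensor holding 16 RGB frames of size 224x224.
sequence = tf.random.uniform([16, 224, 224, 3])

slices = []
for i in range(sequence.shape[0]):
    frame = sequence[i]        # slice out one 3D component tensor: (224, 224, 3)
    slices.append(frame)

print(len(slices), slices[0].shape)   # 16 (224, 224, 3)
```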
Inside the for-loop used for tensor slicing, the 3D neural network models can be defined without any changes to the deep neural network APIs.
Defining a stacked 3D model for handling the 4D tensor input
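A minimal sketch of what such a stacked model could look like using the Keras functional API. The dimensions, the number of classes, and the small stand-in encoder (`make_encoder`) are all hypothetical examples; in practice the encoder could be replaced by any state-of-the-art 3D-input model:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Hypothetical example dimensions: 4 frames of 64x64 RGB images, 10 output classes.
NUM_FRAMES, HEIGHT, WIDTH, CHANNELS, NUM_CLASSES = 4, 64, 64, 3, 10

def make_encoder():
    """A small stand-in encoder for a single 3D (image) tensor."""
    return models.Sequential([
        layers.Conv2D(16, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
    ])

# Each sample is a 4D tensor: (frames, height, width, channels).
inputs = layers.Input(shape=(NUM_FRAMES, HEIGHT, WIDTH, CHANNELS))

# Slice out each 3D frame tensor inside the loop, encode it, and collect the features.
features = []
for i in range(NUM_FRAMES):
    frame = inputs[:, i]                 # (batch, HEIGHT, WIDTH, CHANNELS)
    features.append(make_encoder()(frame))

# Merge layer combining the features from all encoders, followed by a classifier head.
merged = layers.Concatenate()(features)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(merged)

model = models.Model(inputs, outputs)
model.summary()
```

In this sketch each slice gets its own encoder instance (separate weights); a single shared encoder applied to every slice would be an equally valid design choice.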