more on data io...specially for images. #1

Open
AndreGuerra123 opened this issue Nov 29, 2017 · 1 comment

Comments

@AndreGuerra123

Hello, Chin Lin.
First of all, this is a nicely structured guide.
(I'm just starting with mxnet, and there is not much help available for R.)
I don't know if this is the correct place to post this (since it is not really an issue), but here it goes:
I would like to see more on data IO, especially on how to build datasets efficiently from images.
The main topics I'm dealing with are:
- converting images to an X dataset in the most efficient way (in both space and speed);
- the optimal structure for the final X dataset (Height x Width x Channel x Observation, or something else);
- whether to normalize the data, and to which range (0 to 255, 0 to 1, or something else);
- whether there is a reference guide or a format that should be used (data.matrix or array).
I hope my feedback helps your guide in the future, as your answer will certainly help me.
Kind regards, Andre

@xup6fup
Owner

xup6fup commented Dec 2, 2017

Dear Andre,

You can try the following code:

library(mxnet)

# Custom iterator that wraps mx.io.CSVIter and preprocesses each batch on the fly
CustomCSVIter <- setRefClass("CustomCSVIter",
  fields = c("iter", "data.csv", "data.shape", "batch.size"),
  contains = "Rcpp_MXArrayDataIter",
  methods = list(
    initialize = function(iter, data.csv, data.shape, batch.size){
      # Each CSV row holds one label followed by data.shape^2 pixel values
      feature_len <- data.shape*data.shape + 1
      csv_iter <- mx.io.CSVIter(data.csv=data.csv, data.shape=c(feature_len), batch.size=batch.size)
      .self$iter <- csv_iter
      .self$data.csv <- data.csv
      .self$data.shape <- data.shape
      .self$batch.size <- batch.size
      .self
    },
    value = function(){
      # In R the batch is a (feature_len x batch.size) array; row 1 is the label
      val <- as.array(.self$iter$value()$data)
      val.x <- val[-1,]
      val.y <- val[1,]
      # Scale pixel intensities from 0-255 to 0-1
      val.x <- val.x/255
      dim(val.x) <- c(data.shape, data.shape, ncol(val.x))
      # Random crop with an offset of 0 to 2 pixels (28x28 -> 26x26 for MNIST)
      crop.size <- data.shape - 2
      random.x <- sample(0:2, 1) + 1:crop.size
      random.y <- sample(0:2, 1) + 1:crop.size
      val.x <- val.x[random.x, random.y, ]
      # Final 4-D layout: crop.size x crop.size x 1 channel x batch size
      dim(val.x) <- c(crop.size, crop.size, 1, ncol(val))
      val.x <- mx.nd.array(val.x)
      val.y <- mx.nd.array(val.y)
      list(data=val.x, label=val.y)
    },
    # The remaining methods simply delegate to the wrapped CSV iterator
    iter.next = function(){
      .self$iter$iter.next()
    },
    reset = function(){
      .self$iter$reset()
    },
    num.pad = function(){
      .self$iter$num.pad()
    },
    finalize = function(){
      .self$iter$finalize()
    }
  )
)

batch.size <- 50
train.iter <- CustomCSVIter$new(iter = NULL, data.csv = "mnist_train.csv", data.shape = 28, batch.size = batch.size)
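
The preprocessing done inside `value()` can be tried out without mxnet. The following is only a minimal sketch in base R: `preprocess_batch` and the random `fake` batch are hypothetical names made up for illustration, and the batch is assumed to have the same layout as above (row 1 holds the labels, the remaining rows hold flattened 0-255 pixels).

```r
# Hypothetical helper mirroring the value() method, using base R only.
# `batch` is assumed to be a (1 + side^2) x n matrix:
# row 1 = labels, remaining rows = flattened pixels in the 0-255 range.
preprocess_batch <- function(batch, side, offset.x = 0, offset.y = 0) {
  n <- ncol(batch)
  y <- batch[1, ]
  x <- batch[-1, , drop = FALSE] / 255        # scale to 0-1
  dim(x) <- c(side, side, n)                  # one side x side image per observation
  crop <- side - 2
  # Crop a (side - 2) window starting at the given 0-2 pixel offsets
  x <- x[offset.x + 1:crop, offset.y + 1:crop, , drop = FALSE]
  dim(x) <- c(crop, crop, 1, n)               # add the single-channel axis
  list(data = x, label = y)
}

set.seed(1)
side <- 28
n <- 5
# Made-up batch: random labels on top of random pixel values
fake <- rbind(sample(0:9, n, replace = TRUE),
              matrix(sample(0:255, side * side * n, replace = TRUE),
                     nrow = side * side))
out <- preprocess_batch(fake, side,
                        offset.x = sample(0:2, 1),
                        offset.y = sample(0:2, 1))
dim(out$data)    # 26 26 1 5
range(out$data)  # stays within 0 and 1
```

This also touches on the questions about structure and normalization: the data end up as a 4-D array with the observation on the last axis, scaled to the 0-1 range.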

lenet.model <- function(){
  data <- mx.symbol.Variable('data')
  conv1 <- mx.symbol.Convolution(data=data, kernel=c(5,5), num_filter=20) # first conv
  tanh1 <- mx.symbol.Activation(data=conv1, act_type="tanh")
  pool1 <- mx.symbol.Pooling(data=tanh1, pool_type="max", kernel=c(2,2), stride=c(2,2))
  conv2 <- mx.symbol.Convolution(data=pool1, kernel=c(5,5), num_filter=50) # second conv
  tanh2 <- mx.symbol.Activation(data=conv2, act_type="tanh")
  pool2 <- mx.symbol.Pooling(data=tanh2, pool_type="max", kernel=c(2,2), stride=c(2,2))
  flatten <- mx.symbol.Flatten(data=pool2)
  fc1 <- mx.symbol.FullyConnected(data=flatten, num_hidden=100) # first fullc
  tanh3 <- mx.symbol.Activation(data=fc1, act_type="tanh")
  fc2 <- mx.symbol.FullyConnected(data=tanh3, num_hidden=10) # second fullc
  network <- mx.symbol.SoftmaxOutput(data=fc2) # loss
  network
}
network <- lenet.model()
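
As a quick sanity check, the spatial sizes flowing through this network can be worked out by hand. The sketch below assumes no padding, stride 1 for the convolutions, and floor rounding for the pooling layers; `out_side` is a hypothetical helper, not an mxnet function.

```r
# Output side length of a convolution or pooling layer without padding
# (assumption: floor rounding when the stride does not divide evenly).
out_side <- function(side, kernel, stride = 1) {
  floor((side - kernel) / stride) + 1
}

side <- 26                     # cropped input produced by the iterator
side <- out_side(side, 5)      # conv1, 5x5 kernel     -> 22
side <- out_side(side, 2, 2)   # pool1, 2x2, stride 2  -> 11
side <- out_side(side, 5)      # conv2, 5x5 kernel     -> 7
side <- out_side(side, 2, 2)   # pool2, 2x2, stride 2  -> 3
flat <- side * side * 50       # 50 filters in conv2   -> 450 inputs to fc1
```

Under these assumptions the flatten layer passes 450 values per observation to the first fully connected layer.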

n.cpu <- 4
device.cpu <- lapply(0:(n.cpu-1), function(i) {mx.cpu(i)})

model <- mx.model.FeedForward.create(symbol=network,
                                     X=train.iter,
                                     ctx=device.cpu,
                                     num.round=2,
                                     array.batch.size=batch.size,
                                     learning.rate=0.1,
                                     momentum=0.9,
                                     eval.metric=mx.metric.accuracy,
                                     wd=0.00001,
                                     batch.end.callback=mx.callback.log.speedometer(batch.size, frequency = 100)
)
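
To feed your own images through an iterator like the one above, they first need to be written in the layout it reads: one image per row, the label in the first column, followed by the flattened pixels. The following is a rough sketch in base R under the assumption that the file should have no header row and that the images are already loaded as a width x height x n grayscale array; `images_to_csv` and the random `imgs` array are hypothetical and only serve as an illustration.

```r
# Hypothetical helper: write a (side x side x n) array of grayscale images
# plus their labels to a header-less CSV, one image per row, label first.
images_to_csv <- function(images, labels, path) {
  n <- dim(images)[3]
  stopifnot(length(labels) == n)
  # Flatten each image column-major into one row; this matches the
  # dim(val.x) <- c(data.shape, data.shape, ...) reshape used when reading
  pixels <- t(matrix(as.vector(images), ncol = n))
  write.table(cbind(labels, pixels), file = path, sep = ",",
              row.names = FALSE, col.names = FALSE)
}

set.seed(1)
# Made-up data: three random 28x28 "images" with labels 7, 2 and 1
imgs <- array(sample(0:255, 28 * 28 * 3, replace = TRUE), dim = c(28, 28, 3))
path <- tempfile(fileext = ".csv")
images_to_csv(imgs, labels = c(7, 2, 1), path = path)

# Read the file back to confirm the layout: 3 rows, 1 label + 784 pixels each
check <- as.matrix(read.csv(path, header = FALSE))
dim(check)  # 3 785
```

Keeping the raw 0-255 integers in the file and scaling inside the iterator keeps the CSV compact; for large image sets a binary format would likely be more space-efficient than CSV.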
